N-acetylglycosaminyl transferase genes

ABSTRACT

N-acetylglycosaminyltransferase V nucleic acids, proteins encoded by the nucleic acids, and uses of the nucleic acids and proteins.

FIELD OF THE INVENTION

[0001] The invention relates to novel N-acetylglycosaminyltransferase V nucleic acids, proteins encoded by the nucleic acids, and uses of the nucleic acids and proteins.

BACKGROUND OF THE INVENTION

[0002] Protein glycosylation is mediated by a series of enzymes found in the Golgi apparatus. Many of the enzymes in this pathway are subject to regulation during embryogenesis, lymphocyte activation, and in cancer progression. Structural diversity of carbohydrates on cell surfaces and secreted or non-secreted (e.g. receptors) proteins affects their function and the associated cell biology. Somatic mutations and drugs which block the biosynthesis of -GlcNAcβ1-6Manα1-6Manβ-branching of N-linked oligosaccharides also inhibit organ colonisation, invasion in vitro, and limit solid tumor growth in vivo.

[0003] Synthesis of GlcNAc-branched carbohydrate structure is dependent upon N-acetylglycosaminyltransferases, one of which is N-acetylglycosaminyltransferase V (GlcNAc-TV). GlcNAc-TV catalyzes the addition of 1-6GlcNAc to the trimannosyl core in the biosynthetic pathway for branched complex-type N-linked oligosaccharides found on some cell surface and secreted glycoproteins (Schachter, H. (1986) Biochem. Cell. Biol. 64: 163-181). The 1-6GlcNAc product of GlcNAc-TV is the preferred antenna and rate limiting substrate in the pathway for addition of terminal polylactosamine sequences which affect cell-cell and cell-substratum interactions (van den Eijnden, D. H. et al, (1988) 263:12461-12465; Yousefi, S. et al, (1991) J. Biol Chem. 266:1772-1783; and Heffernan, M. et al, (1993) J. Biol. Chem. 268:1242-1251).

[0004] The rat (Shoreibah, M. et al (1993), 268:15381-15385) and human (Saito, H. et al. (1994), Biochem. Biophys. Res. Commun. 198:318-327) GlcNAc-TV sequences predict a 741 amino acid type II glycoprotein. The human GlcNAc-TV gene is located on human chromosome 2q21 with 17 exons and spans 155 Kb (Saito et al., (1995) Eur. J. Biochem. 233: 18-26). The putative promoter region of the GlcNAc-TV gene has AP-1 and PEA3/ets binding sites, and is responsive to ras signaling pathways (Buckhaults P J Biol Chem (1997) 272:19575-19581).

[0005] Oncogenic transformation of rodent fibroblasts by polyoma virus, v-src, H-ras or v-fps leads to increased GlcNAc-TV expression (Yamashita, K. et al, (1985) J. Biol. Chem. 260:3963-3969; Pierce, M and Arango, J. (1986) J. Biol. Chem. 261: 10772-10777; Dennis, J et al. (1987) Science 236:582-585), and in human carcinomas of breast, colon and skin GlcNAc-TV-generated structures correlate with pathological staging of tumors (Fernandes, B. et al (1991) 51:718-723). The GlcNAc-TV message is also subject to increased frequency of alternate splicing in tumor cells, resulting in a peptide encoded by an intron sequence of the GlcNAc-TV gene which has been identified as a widely occurring “tumor-associated antigen”. Fifty percent of tested human melanoma tumors expressed this antigen, while it is absent in normal tissues (Guilloux, Y. et al (1996) J. Exp. Med. 183:1173-1183). In a rat model of heritable liver cancer, GlcNAc-TV transcript levels are elevated in primary tumors and lymph node metastases (Miyoshi, E. et al, (1993) Cancer Res. 53:3899-3902). In addition, ectopic expression of GlcNAc-TV in epithelial cells results in morphological transformation and tumorigenesis (Demetriou, M. et al (1995) J. Cell Biol. 130:383), while tumor cell mutants selected for loss of GlcNAc-TV activity show reduced malignant potential in vivo (Lu, Y. et al (1994) Clin. Exp. Metastasis 12:47-54).

SUMMARY OF THE INVENTION

[0006] The present inventors have identified novel GlcNAc-TV nucleic acid molecules. The nucleic acids are herein designated “glcNAc-TV-b” or “GlcNAc-TV-b nucleic acid molecule” and “glcNAc-TV-c” or “GlcNAc-TV-c nucleic acid molecule”. The proteins encoded by the nucleic acid molecules are herein designated “GlcNAc-TV-b” or “GlcNAc-TV-b Protein”, and “GlcNAc-TV-c” or “GlcNAc-TV-c Protein”.

[0007] Broadly stated, the present invention contemplates an isolated nucleic acid molecule encoding a protein of the invention, including mRNAs, DNAs, cDNAs, genomic DNAs, PNAs, as well as antisense analogs and biologically, diagnostically, prophylactically, clinically or therapeutically useful variants or fragments thereof, and compositions comprising same.

[0008] In particular, the present invention contemplates an isolated GlcNAc-TV-b or GlcNAc-TV-c nucleic acid molecule comprising a sequence that comprises at least 18 nucleotides and hybridizes under stringent conditions to the complementary nucleic acid sequence of SEQ. ID. NO. 1, or a degenerate form thereof. Further embodiments of this aspect of the invention provide biologically, diagnostically, prophylactically, clinically or therapeutically useful variants thereof and compositions comprising same.

[0009] The invention also contemplates an isolated GlcNAc-TV-b or GlcNAc-TV-c protein encoded by a nucleic acid molecule of the invention, a truncation, an analog, an allelic or species variation thereof, or a homolog of a protein of the invention, or a truncation thereof. (Truncations, analogs, allelic or species variations, and homologs are collectively referred to herein as “GlcNAc-TV-b Related Proteins” or “GlcNAc-TV-c Related Proteins”).

[0010] The nucleic acid molecules of the invention permit identification of untranslated nucleic acid sequences or regulatory sequences which specifically promote expression of genes operatively linked to the promoter regions. Identification and use of such promoter sequences are particularly desirable in instances, such as gene transfer or gene therapy, which can specifically require heterologous gene expression in a limited environment (e.g. CNS environment). The invention therefore contemplates a nucleic acid encoding a regulatory sequence of a nucleic acid molecule of the invention such as a promoter sequence, preferably a regulatory sequence of glcNAc-TV-b or glcNAc-TV-c.

[0011] The nucleic acid molecules which encode a mature GlcNAc-TV-b or GlcNAc-TV-c protein may include only the coding sequence for the mature polypeptide (SEQ ID NO. 5 or 9); the coding sequence for the mature polypeptide and additional coding sequences (e.g. leader or secretory sequences, proprotein sequences); or the coding sequence for the mature polypeptide (and optionally additional coding sequence) and non-coding sequence, such as introns or non-coding sequences 5′ and/or 3′ of the coding sequence of the mature polypeptide (e.g. SEQ ID NO. 3).

[0012] Therefore, the term “nucleic acid molecule encoding a protein” encompasses a nucleic acid molecule which includes only coding sequence for the protein as well as a nucleic acid molecule which includes additional coding and/or non-coding sequences.

[0013] The nucleic acids of the invention may be inserted into an appropriate expression vector, and the vector may contain the necessary elements for the transcription and translation of an inserted coding sequence. Accordingly, recombinant expression vectors may be constructed which comprise a nucleic acid molecule of the invention, and where appropriate one or more transcription and translation elements linked to the nucleic acid molecule.

[0014] Vectors are contemplated within the scope of the invention which comprise regulatory sequences of the invention, as well as chimeric gene constructs wherein a regulatory sequence of the invention is operably linked to a nucleic acid sequence encoding a heterologous protein (i.e. a protein not naturally expressed in the host cell), and a transcription termination signal.

[0015] A recombinant expression vector can be used to transform host cells to express a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Proteins, a GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Proteins, or a heterologous protein. Therefore, the invention further provides host cells containing a recombinant molecule of the invention. The invention also contemplates transgenic non-human mammals whose germ cells and somatic cells contain a recombinant molecule comprising a nucleic acid molecule of the invention, in particular one that encodes an analog of GlcNAc-TV-b or GlcNAc-TV-c, or a truncation of GlcNAc-TV-b or GlcNAc-TV-c.

[0016] The proteins of the invention may be obtained as an isolate from natural cell sources, but they are preferably produced by recombinant procedures. In one aspect the invention provides a method for preparing a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein utilizing the purified and isolated nucleic acid molecules of the invention. In an embodiment a method for preparing a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein is provided comprising:

[0017] (a) transferring a recombinant expression vector of the invention having a nucleotide sequence encoding a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein, into a host cell;

[0018] (b) selecting transformed host cells from untransformed host cells;

[0019] (c) culturing a selected transformed host cell under conditions which allow expression of the GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein; and

[0020] (d) isolating the GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein.

[0021] The invention further broadly contemplates a recombinant GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein obtained using a method of the invention.

[0022] A GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention may be conjugated with other molecules, such as proteins, to prepare fusion proteins or chimeric proteins. This may be accomplished, for example, by the synthesis of N-terminal or C-terminal fusion proteins.

[0023] The invention further contemplates antibodies having specificity against an epitope of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention. Antibodies may be labeled with a detectable substance and used to detect proteins of the invention in biological samples, tissues, and cells.

[0024] The invention also permits the construction of nucleotide probes which are unique to the nucleic acid molecules of the invention or to proteins of the invention. Therefore, the invention also relates to a probe comprising a sequence encoding a protein of the invention, or a part thereof. The probe may be labeled, for example, with a detectable substance and it may be used to select from a mixture of nucleotide sequences a nucleic acid molecule of the invention including nucleic acid molecules coding for a protein which displays one or more of the properties of a protein of the invention.

[0025] In accordance with an aspect of the invention there is provided a method of, and products for, diagnosing and monitoring conditions mediated by a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein by determining the presence of nucleic acid molecules and proteins of the invention.

[0026] Still further the invention provides a method for evaluating a test compound for its ability to modulate the biological activity of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention. For example, a substance which inhibits or enhances the catalytic activity of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein may be evaluated. “Modulate” refers to a change or an alteration in the biological activity of a protein of the invention. Modulation may be an increase or a decrease in activity, a change in characteristics, or any other change in the biological, functional, or immunological properties of the protein.

[0027] Compounds which modulate the biological activity of a protein of the invention may also be identified using the methods of the invention by comparing the pattern and level of expression of a nucleic acid molecule or protein of the invention in tissues and cells, in the presence and in the absence of the compounds.

[0028] Methods are also contemplated that identify compounds or substances (e.g. proteins) which bind to glcNAc-TV-b or glcNAc-TV-c regulatory sequences (e.g. promoter sequences, enhancer sequences, negative modulator sequences).

[0029] The substances and compounds identified using the methods of the invention may be used to modulate the biological activity of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention, and they may be used in the treatment of conditions mediated by the proteins including but not limited to proliferative diseases such as cancer, viral, bacterial, and parasitic infections, to stimulate hematopoietic progenitor cell growth, or confer protection against chemotherapy or radiation therapy. Accordingly, the nucleic acid molecules and proteins of the invention, and substances and compounds may be formulated into compositions for administration to individuals suffering from one or more of these conditions. Therefore, the present invention also relates to a composition comprising one or more of a nucleic acid molecule or protein of the invention, or a substance or compound identified using the methods of the invention, and a pharmaceutically acceptable carrier, excipient or diluent. A method for treating or preventing these conditions is also provided comprising administering to a patient in need thereof, a composition of the invention.

[0030] The present invention provides the means necessary for production of gene-based therapies directed at the brain. These therapeutic agents may take the form of polynucleotides comprising all or a portion of a nucleic acid of the invention comprising a regulatory sequence of glcNAc-TV-b or glcNAc-TV-c placed in appropriate vectors or delivered to target cells in more direct ways.

[0031] Having provided novel GlcNAc-TV proteins, and nucleic acids encoding same, the invention accordingly further provides methods for preparing oligosaccharides, e.g. two or more saccharides. In specific embodiments, the invention relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc, and an acceptor in the presence of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention.

[0032] In accordance with a further aspect of the invention, there are provided processes for utilizing proteins or nucleic acid molecules of the invention, for in vitro purposes related to scientific research, synthesis of DNA, and manufacture of vectors.

[0033] These and other aspects, features, and advantages of the present invention should be apparent to those skilled in the art from the following drawings and detailed description.

DESCRIPTION OF THE DRAWINGS

[0034] The invention will be better understood with reference to the drawings in which:

[0035] FIG. 1 is a reproduction of autoradiograms resulting from a Northern hybridization experiment in which mRNA isolated from different human tissues was size-fractionated and probed with radioactive human partial GlcNAc-TV clone (nucleotides 1508-1921) and human partial GlcNAc-TV-b (nucleotides 1959-2417);

[0036] FIG. 2 is a reproduction of autoradiograms resulting from a Northern hybridization experiment in which mRNA isolated from different human brain tissues was size-fractionated and probed with radioactive human partial GlcNAc-TV clone (nucleotides 1508-1921) and human partial GlcNAc-TV-b (nucleotides 1959-2417); and

[0037] FIG. 3 is a reproduction of a phosphoimager image resulting from a Northern hybridization experiment in which mRNA isolated from different human tumor cell lines was size-fractionated and probed with radioactive human partial GlcNAc-TV clone (nucleotides 1508-1921) and human partial GlcNAc-TV-b (nucleotides 1959-2417).

DETAILED DESCRIPTION OF THE INVENTION

[0038] In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See for example, Sambrook, Fritsch, & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; DNA Cloning: A Practical Approach, Volumes I and II (D. N. Glover ed. 1985); Oligonucleotide Synthesis (M. J. Gait ed. 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1985); Transcription and Translation (B. D. Hames & S. J. Higgins eds. 1984); Animal Cell Culture (R. I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); and B. Perbal, A Practical Guide to Molecular Cloning (1984).

Nucleic Acid Molecules of the Invention

[0039] As hereinbefore mentioned, the invention provides isolated GlcNAc-TV-b and GlcNAc-TV-c nucleic acid molecules. The GlcNAc-TV-b and GlcNAc-TV-c nucleic acid molecules differ in their 3′ ends.

[0040] The term “isolated” refers to a nucleic acid (or protein) removed from its natural environment, purified or separated, or substantially free of cellular material or culture medium when produced by recombinant DNA techniques, or chemical reactants, or other chemicals when chemically synthesized. Preferably, an isolated nucleic acid molecule is at least 60% free, more preferably at least 75% free, and most preferably at least 90% free from other components with which it is naturally associated. The term “nucleic acid” is intended to include modified or unmodified DNA, RNA, including mRNAs, DNAs, cDNAs, and genomic DNAs, or a mixed polymer, and can be either single-stranded, double-stranded or triple-stranded. For example, a nucleic acid sequence may be a single-stranded or double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, or single-, double- and triple-stranded regions, single- and double-stranded RNA, RNA that may be single-stranded, or more typically, double-stranded, or triple-stranded, or a mixture of regions comprising RNA or DNA, or both RNA and DNA. The strands in such regions may be from the same molecule or from different molecules. The DNAs or RNAs may contain one or more modified bases. For example, the DNAs or RNAs may have backbones modified for stability or for other reasons. A nucleic acid sequence includes an oligonucleotide, nucleotide, or polynucleotide. The term “nucleic acid molecule”, and in particular DNA or RNA, refers only to the primary and secondary structure and is not limited to any particular tertiary form.

[0041] In an embodiment of the invention an isolated nucleic acid is contemplated which comprises:

[0042] (i) a nucleic acid sequence encoding a protein having substantial sequence identity, preferably at least 70%, more preferably at least 75% sequence identity, with an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12;

[0043] (ii) nucleic acid sequences complementary to (i);

[0044] (iii) nucleic acid sequences differing from any of the nucleic acids of (i) or (ii) in codon sequences due to the degeneracy of the genetic code;

[0045] (iv) a nucleic acid sequence comprising at least 18 nucleotides and capable of hybridizing under stringent conditions to a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11 or to a degenerate form thereof;

[0046] (v) a nucleic acid sequence encoding a truncation, an analog, an allelic or species variation of a protein comprising an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12; or

[0047] (vi) a fragment, or allelic or species variation of (i), (ii) or (iii).

[0048] In a specific embodiment, the isolated nucleic acid comprises:

[0049] (i) a nucleic acid sequence having substantial sequence identity, preferably at least 70%, more preferably at least 75% sequence identity, with a nucleotide sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11;

[0050] (ii) nucleic acid sequences complementary to (i), preferably complementary to a full nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11;

[0051] (iii) nucleic acid sequences differing from any of the nucleic acids of (i) to (ii) in codon sequences due to the degeneracy of the genetic code; or

[0052] (iv) a fragment, or allelic or species variation of (i), (ii) or (iii).

[0053] The term “complementary” refers to the natural binding of nucleic acid molecules under permissive salt and temperature conditions by base-pairing. For example, the sequence “A-G-T” binds to the complementary sequence “T-C-A”. Complementarity between two single-stranded molecules may be “partial”, in which only some of the nucleotides pair, or it may be complete, when total complementarity exists between the single-stranded molecules.
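The base-pairing rule described above can be illustrated with a short sketch. The helper names (PAIRS, complement, reverse_complement, fraction_paired) are hypothetical and are not part of the specification; the sketch assumes plain DNA strings containing only A, C, G and T, and it scores partial complementarity position by position without any alignment step.

```python
# Illustrative helpers only; the names below are not part of the specification.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}  # Watson-Crick DNA pairs


def complement(seq: str) -> str:
    """Return the base-by-base complement, written in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite direction (the antiparallel strand)."""
    return complement(seq)[::-1]


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of aligned positions at which two equal-length strands base-pair."""
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    pairs = sum(PAIRS[a] == b for a, b in zip(strand_a.upper(), strand_b.upper()))
    return pairs / len(strand_a)


print(complement("AGT"))              # TCA, as in the example in the text
print(fraction_paired("AGT", "TCA"))  # 1.0, complete complementarity
print(fraction_paired("AGT", "TCG"))  # below 1.0, partial complementarity
```

A value of 1.0 from fraction_paired corresponds to complete complementarity as defined above, and any lower value to partial complementarity.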

[0054] In a preferred embodiment the isolated nucleic acid comprises a nucleic acid sequence encoding an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12 or comprises a nucleotide sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11 wherein T can also be U.

[0055] The terms “sequence similarity” or “sequence identity” refer to the relationship between two or more amino acid or nucleic acid sequences, determined by comparing the sequences, which relationship is generally known as “homology”. Identity in the art also means the degree of sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as determined by the match between strings of such sequences. Both identity and similarity can be readily calculated (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W. ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G. eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, New York, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton Press, New York, 1991). While there are a number of existing methods to measure identity and similarity between two amino acid sequences or two nucleic acid sequences, both terms are well known to the skilled artisan (Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, New York, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton Press, New York, 1991; and Carrillo, H., and Lipman, D., SIAM J. Applied Math., 48:1073, 1988). Preferred methods for determining identity are designed to give the largest match between the sequences tested. Methods to determine identity are codified in computer programs. Preferred computer program methods for determining identity and similarity between two sequences include but are not limited to the GCG program package (Devereux, J. et al, Nucleic Acids Research 12(1): 387, 1984), BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., J. Molec. Biol. 215:403, 1990). Identity or similarity may also be determined using the alignment algorithm of Dayhoff et al., Methods in Enzymology 91: 524-545 (1983).
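As a minimal sketch of how a percent identity figure may be derived once two sequences have been aligned, the following function counts matching columns. It is an illustration only: the function name and the use of "-" as the gap character are assumptions, the alignment itself is assumed to have been produced beforehand by a program such as those cited above, and the denominator used here (all aligned columns) is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# One mismatch in ten aligned columns
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
# One gap column in five aligned columns
print(percent_identity("ACG-T", "ACGTT"))
```

The resulting value can be compared against thresholds such as the 70% or 75% figures recited in the embodiments.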

[0056] Preferably, a nucleic acid molecule of the present invention has substantial sequence identity using the preferred computer programs cited herein, for example at least 70%, more preferably at least 75% nucleic acid identity; still more preferably at least 80% nucleic acid identity; and most preferably at least 90% to 95% sequence identity to a sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11.

[0057] Isolated nucleic acid molecules encoding a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein, and having a sequence which differs from a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11, due to degeneracy in the genetic code are also within the scope of the invention. Such nucleic acid molecules encode equivalent proteins but differ in sequence from a sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11 due to degeneracy in the genetic code. As one example, DNA sequence polymorphisms within glcNAc-TV-b or glcNAc-TV-c may result in silent mutations which do not affect the amino acid sequence. Variations in one or more nucleotides may exist among individuals within a population due to natural allelic variation. Any and all such nucleic acid variations are within the scope of the invention. DNA sequence polymorphisms may also occur which lead to changes in the amino acid sequence of GlcNAc-TV-b Protein or GlcNAc-TV-c Protein. These amino acid polymorphisms are also within the scope of the present invention. In addition, species variations, i.e. variations in nucleotide sequence naturally occurring among different species, are within the scope of the invention.
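Degeneracy of the genetic code can be illustrated by translating two different nucleotide sequences and observing that they yield the same peptide. The sketch below builds the standard genetic code table; the function name and the two short example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_seq: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon).

    U is treated as T, so RNA and DNA forms of a sequence translate identically.
    Trailing bases that do not fill a whole codon are ignored.
    """
    seq = coding_seq.upper().replace("U", "T")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Two hypothetical sequences that differ at silent (third-position or synonymous) sites
variant_1 = "ATGGCTTTA"
variant_2 = "ATGGCCCTG"
print(translate(variant_1), translate(variant_2))  # the same peptide for both
```

Sequences such as variant_1 and variant_2 differ at the nucleotide level yet encode an equivalent protein, which is the situation the paragraph above describes.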

[0058] Another aspect of the invention provides a nucleic acid molecule which hybridizes under selective conditions, e.g. high stringency conditions, to a nucleic acid which comprises a sequence which encodes a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein of the invention. Preferably the sequence encodes an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12 and comprises at least 18 nucleotides. Selectivity of hybridization occurs with a certain degree of specificity rather than being random. Appropriate stringency conditions which promote DNA hybridization are known to those skilled in the art, or can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. Numerous equivalent conditions comprising either low or high stringency depend on factors such as the length and nature of the sequence (DNA, RNA, base composition), nature of the target (DNA, RNA, base composition), milieu (in solution or immobilized on a solid substrate), concentration of salts and other components (e.g. formamide, dextran sulfate and/or polyethylene glycol), and temperature of the reactions (within a range from about 5° C. below the melting temperature of the probe to about 20° C. to 25° C. below the melting temperature). One or more factors may be varied to generate conditions of either low or high stringency different from, but equivalent to, the above listed conditions. For example, 6.0× sodium chloride/sodium citrate (SSC) or 0.5% SDS at about 45° C., followed by a wash of 2.0×SSC at 50° C. may be employed. The stringency may be selected based on the conditions used in the wash step. By way of example, the salt concentration in the wash step can be selected for a high stringency of about 0.2×SSC at 50° C. In addition, the temperature in the wash step can be at high stringency conditions, at about 65° C.
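The temperature range recited above (from about 5° C. to about 20° C. to 25° C. below the melting temperature of the probe) may be sketched numerically. The specification does not state how the melting temperature is obtained; purely as an assumption for illustration, the sketch uses the Wallace rule, a rough approximation for short oligonucleotides, and the function names and example probe are hypothetical. Salt, formamide and the other factors listed above are not modelled.

```python
def wallace_tm(probe: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    This is an approximation for short oligonucleotides and is an assumption
    of this sketch, not a method taken from the specification.
    """
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc


def hybridization_window(tm: float) -> tuple[float, float]:
    """Return (lowest, highest) reaction temperature: 25 deg C to 5 deg C below Tm."""
    return (tm - 25, tm - 5)


probe = "ATGGCTTTAGCCGGTACC"  # hypothetical 18-nucleotide probe
tm = wallace_tm(probe)
low, high = hybridization_window(tm)
print(f"estimated Tm {tm} deg C, reaction window {low} to {high} deg C")
```

Temperatures toward the upper end of the window correspond to more stringent conditions; in practice the window would be adjusted empirically for the factors the paragraph lists.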

[0059] It will be appreciated that the invention includes nucleic acid molecules encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, including truncations of the proteins, allelic and species variants, and analogs of the proteins as described herein. In particular, fragments of a nucleic acid molecule of the invention are contemplated that are a stretch of at least about 10, preferably at least 15, more preferably at least 18, and most preferably at least 20 nucleotides, more typically at least 50 to 200 nucleotides but less than 2 kb. It will further be appreciated that variant forms of the nucleic acid molecules of the invention which arise by alternative splicing of an mRNA corresponding to a cDNA of the invention are encompassed by the invention.

[0060] An isolated nucleic acid molecule of the invention which comprises DNA can be isolated by preparing a labeled nucleic acid probe based on all or part of a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11. The labeled nucleic acid probe is used to screen an appropriate DNA library (e.g. a cDNA or genomic DNA library). For example, a cDNA library can be used to isolate a cDNA encoding a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein by screening the library with the labeled probe using standard techniques. Alternatively, a genomic DNA library can be similarly screened to isolate a genomic clone encompassing a glcNAc-TV-b or glcNAc-TV-c gene. Nucleic acids isolated by screening of a cDNA or genomic DNA library can be sequenced by standard techniques.

[0061] An isolated nucleic acid molecule of the invention which is DNA can also be isolated by selectively amplifying a nucleic acid of the invention. “Amplifying” or “amplification” refers to the production of additional copies of a nucleic acid sequence and is generally carried out using polymerase chain reaction (PCR) technologies well known in the art (Dieffenbach, C. W. and G. S. Dveksler (1995) PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y.). In particular, it is possible to design synthetic oligonucleotide primers from a nucleotide sequence of SEQ. ID. NO. 1, 3, 5, 7, 8, 9, or 11 for use in PCR. A nucleic acid can be amplified from cDNA or genomic DNA using these oligonucleotide primers and standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. cDNA may be prepared from mRNA, by isolating total cellular mRNA by a variety of techniques, for example, by using the guanidinium-thiocyanate extraction procedure of Chirgwin et al., Biochemistry, 18, 5294-5299 (1979). cDNA is then synthesized from the mRNA using reverse transcriptase (for example, Moloney MLV reverse transcriptase available from Gibco/BRL, Bethesda, Md., or AMV reverse transcriptase available from Seikagaku America, Inc., St. Petersburg, Fla.).
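The relationship between a primer pair and the product it amplifies can be sketched in silico. The following is an illustration under simplifying assumptions: primers must match the template exactly, only the first forward site and the first downstream reverse site are used, and the function names, template and primers are hypothetical rather than drawn from the sequence listing.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def predict_amplicon(template: str, forward: str, reverse: str):
    """Return the region of the template bounded by the two primers, or None.

    The forward primer is matched on the given strand; the reverse primer
    anneals to that strand, so its reverse complement is searched for
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


template = "TTATGGCTTTAGCCGGTACGTT"  # hypothetical template strand
print(predict_amplicon(template, forward="ATGGCT", reverse="CGTACC"))
```

Real primer design would additionally weigh primer length, melting temperature and specificity, none of which is modelled here.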

[0062] An isolated nucleic acid molecule of the invention which is RNA can be isolated by cloning a cDNA encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein into an appropriate vector which allows for transcription of the cDNA to produce an RNA molecule which encodes a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. For example, a cDNA can be cloned downstream of a bacteriophage promoter (e.g. a T7 promoter) in a vector, the cDNA can be transcribed in vitro with T7 polymerase, and the resultant RNA can be isolated by conventional techniques.

[0063] Nucleic acid molecules of the invention may be chemically synthesized using standard techniques. Methods of chemically synthesizing polydeoxynucleotides are known, including but not limited to solid-phase synthesis which, like peptide synthesis, has been fully automated in commercially available DNA synthesizers (See e.g., Itakura et al. U.S. Pat. No. 4,598,049; Caruthers et al. U.S. Pat. No. 4,458,066; and Itakura U.S. Pat. Nos. 4,401,796 and 4,373,071).

[0064] Determination of whether a particular nucleic acid molecule is a GlcNAc-TV-b or GlcNAc-TV-c or encodes a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be accomplished by expressing the cDNA in an appropriate host cell by standard techniques, and testing the expressed protein in the methods described herein. A GlcNAc-TV-b or GlcNAc-TV-c cDNA or cDNA encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be sequenced by standard techniques, such as dideoxynucleotide chain termination or Maxam-Gilbert chemical sequencing, to determine the nucleic acid sequence and the predicted amino acid sequence of the encoded protein.

[0065] The initiation codon and untranslated sequences of a nucleic acid molecule of the invention may be determined using computer software designed for the purpose, such as PC/Gene (IntelliGenetics Inc., Calif.). The intron-exon structure and the transcription regulatory sequences of a nucleic acid molecule of the invention and/or encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be identified by using a nucleic acid molecule of the invention to probe a genomic DNA clone library. Regulatory elements can be identified using standard techniques. The function of the elements can be confirmed by using these elements to express a reporter gene such as the lacZ gene which is operatively linked to the elements. These constructs may be introduced into cultured cells using conventional procedures or into non-human transgenic animal models. In addition to identifying regulatory elements in DNA, such constructs may also be used to identify nuclear proteins interactingwith the elements, using techniques known in the art.
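A simplified picture of how software may locate an initiation codon and the flanking untranslated sequences is given below. This sketch does not reproduce the behaviour of PC/Gene or any other named package; it assumes, for illustration only, that translation starts at the first ATG and ends at the first in-frame stop codon, and the function name and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_transcript(cdna: str):
    """Split a cDNA into (5' untranslated, coding, 3' untranslated) parts.

    Assumes the first ATG is the initiation codon and reads in frame to the
    first stop codon, which is kept as the end of the coding part. Returns
    None when no ATG or no in-frame stop codon is found.
    """
    seq = cdna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            end = i + 3
            return seq[:start], seq[start:end], seq[end:]
    return None


print(split_transcript("CCATGGCTTTATAAGG"))  # hypothetical 16-nucleotide cDNA
```

Actual initiation codons are not always the first ATG, so a prediction of this kind would normally be checked against sequence context and experimental data.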

[0066] In accordance with one aspect of the invention, a nucleic acid is provided comprising a GlcNAc-TV-b regulatory sequence such as a promoter sequence. In particular, an isolated nucleic acid molecule is contemplated which comprises:

[0067] (i) a nucleic acid sequence having at least 75% sequence identity with a sequence of SEQ. ID. NO. 7 or 8;

[0068] (ii) nucleic acid sequences complementary to (i);

[0069] (iii) nucleic acid sequences differing from any of the nucleic acids of (i) or (ii) in codon sequences due to the degeneracy of the genetic code;

[0070] (iv) a nucleic acid sequence comprising at least 10, most preferably 18 nucleotides and capable of hybridizing under stringent conditions to a nucleic acid sequence of SEQ. ID. NO. 7 or 8, or to a degenerate form thereof;

[0071] (v) a fragment, or allelic or species variation of (i), (ii) or (iii).

[0072] In a preferred embodiment, the isolated nucleic acid comprises a nucleic acid sequence of SEQ. ID. NO. 7 or 8, wherein T can also be U.

[0073] The invention contemplates nucleic acid molecules comprising all or a portion of a nucleic acid of the invention comprising a regulatory sequence of a glcNAc-TV-b gene or a glcNAc-TV-c gene (e.g. SEQ. ID. NO. 7 or 8) contained in appropriate vectors. The vectors may contain heterologous nucleic acid sequences. “Heterologous nucleic acid” refers to a nucleic acid not naturally located in the cell, or in a chromosomal site of the cell. Preferably, the heterologous nucleic acid includes a nucleic acid foreign to the cell.

[0074] In accordance with another aspect of the invention, the nucleic acids isolated using the methods described herein are mutant glcNAc-TV-b or glcNAc-TV-c gene alleles. For example, the mutant alleles may be isolated from individuals either known or proposed to have a genotype which contributes to the symptoms of cancer. Mutant alleles and mutant allele products may be used in therapeutic and diagnostic methods described herein. For example, a cDNA of a mutant glcNAc-TV-b gene may be isolated using PCR as described herein, and the DNA sequence of the mutant allele may be compared to the normal allele to ascertain the mutation(s) responsible for the loss or alteration of function of the mutant gene product. A genomic library can also be constructed using DNA from an individual suspected of or known to carry a mutant allele, or a cDNA library can be constructed using RNA from tissue known or suspected to express the mutant allele. A nucleic acid encoding a normal glcNAc-TV-b gene, or any suitable fragment thereof, may then be labeled and used as a probe to identify the corresponding mutant allele in such libraries. Clones containing mutant sequences can be purified and subjected to sequence analysis. In addition, an expression library can be constructed using cDNA from RNA isolated from a tissue of an individual known or suspected to express a mutant glcNAc-TV-b allele. Gene products from putatively mutant tissue may be expressed and screened, for example using antibodies specific for a GlcNAc-TV-b Protein or a GlcNAc-TV-b Related Protein as described herein. Library clones identified using the antibodies can be purified and subjected to sequence analysis.

[0075] Antisense molecules and ribozymes are contemplated within the scope of the invention. “Antisense” refers to any composition containing nucleotide sequences which are complementary to a specific DNA or RNA sequence. Ribozymes are enzymatic RNA molecules that can be used to catalyze the specific cleavage of RNA. Antisense molecules and ribozymes may be prepared by any method known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, cDNA constructs that synthesize antisense RNA constitutively or inducibly can be introduced into cell lines, cells, or tissues. RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule or the use of phosphorothioate or 2′-O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

Proteins of the Invention

[0076] The proteins of the invention are predominantly expressed in the central nervous system, with the exception of the spinal cord. The proteins are also expressed in different tumors such as cervical carcinoma, lung carcinoma, colon carcinoma, and melanoma, and they have been specifically found in tumors from the breast and uterus.

[0077] The amino acid sequence of an isolated GlcNAc-TV-b Protein of the invention comprises a sequence of SEQ. ID. NO. 2, 4, or 6. The amino acid sequence of an isolated GlcNAc-TV-c Protein of the invention comprises a sequence of SEQ. ID. NO. 2, 10, or 12. In addition to proteins comprising an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12, the proteins of the present invention include truncations, analogs, allelic and species variations, and homologs of GlcNAc-TV-b or GlcNAc-TV-c and truncations thereof as described herein (i.e. GlcNAc-TV-b Related Proteins or GlcNAc-TV-c Related Proteins).

[0078] Truncated proteins may comprise peptides of between 3 and 70 amino acid residues, ranging in size from a tripeptide to a 70-mer polypeptide, preferably 12 to 20 amino acids. In one aspect of the invention, fragments of a GlcNAc-TV-b or GlcNAc-TV-c protein are provided having an amino acid sequence of at least five consecutive amino acids of SEQ. ID. NO. 2, 4, 6, 10, or 12, where no amino acid sequence of five or more, six or more, seven or more, or eight or more consecutive amino acids present in the fragment is present in a protein other than GlcNAc-TV-b or GlcNAc-TV-c. In an embodiment of the invention the fragment is a stretch of amino acid residues of at least 12 to 20 contiguous amino acids from particular sequences such as the sequences of SEQ. ID. NO. 2, 4, 6, 10, or 12. The fragments may be immunogenic and preferably are not immunoreactive with antibodies that are immunoreactive to proteins other than GlcNAc-TV-b or GlcNAc-TV-c.
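The uniqueness condition on fragments can be illustrated with a minimal sketch. It assumes plain one-letter amino acid strings and uses hypothetical toy sequences that are not taken from SEQ. ID. NO. 2, 4, 6, 10, or 12; the function name `unique_fragments` is likewise an illustrative assumption. The sketch lists every window of k consecutive residues in a target protein that does not occur in any of a set of other proteins.

```python
def unique_fragments(target: str, others: list[str], k: int = 5) -> list[str]:
    """Return the k-residue windows of `target` found in none of `others`."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    seen = set()
    result = []
    for i in range(len(target) - k + 1):
        window = target[i:i + k]
        if window in seen:
            continue  # report each distinct window only once
        seen.add(window)
        # Keep the window only if no other protein contains it.
        if not any(window in other for other in others):
            result.append(window)
    return result


# Hypothetical toy sequences for illustration only.
target = "MKLVSTAQW"
others = ["MKLVSGGGG", "PPSTAQWPP"]
print(unique_fragments(target, others, k=5))
```

In practice the comparison set would be a protein database rather than a short list, and a longer fragment would qualify only if none of its k-residue windows is shared.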

[0079] The truncated proteins may have an amino group (-NH2), a hydrophobic group (for example, carbobenzoxyl, dansyl, or t-butyloxycarbonyl), an acetyl group, a 9-fluorenylmethoxy-carbonyl (FMOC) group, or a macromolecule including but not limited to lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates at the amino terminal end. The truncated proteins may have a carboxyl group, an amido group, a t-butyloxycarbonyl group, or a macromolecule including but not limited to lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates at the carboxy terminal end.

[0080] The proteins of the invention may also include analogs of GlcNAc-TV-b or GlcNAc-TV-c, and/or truncations thereof as described herein, which may include, but are not limited to, GlcNAc-TV-b or GlcNAc-TV-c containing one or more amino acid substitutions, insertions, and/or deletions. Amino acid substitutions may be of a conserved or non-conserved nature. Conserved amino acid substitutions involve replacing one or more amino acids of the GlcNAc-TV-b or GlcNAc-TV-c amino acid sequence with amino acids of similar charge, size, and/or hydrophobicity characteristics. When only conserved substitutions are made the resulting analog is preferably functionally equivalent to GlcNAc-TV-b or GlcNAc-TV-c. Non-conserved substitutions involve replacing one or more amino acids of the GlcNAc-TV-b or GlcNAc-TV-c amino acid sequence with one or more amino acids which possess dissimilar charge, size, and/or hydrophobicity characteristics.
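The distinction between conserved and non-conserved substitutions can be sketched as follows. The residue grouping below is an assumption chosen for illustration (the specification does not define particular classes), and the sequences are hypothetical. The sketch compares two equal-length sequences position by position and labels each difference.

```python
# Illustrative grouping by broad side-chain character; an assumption,
# not a classification given in the specification.
GROUPS = {
    "nonpolar": set("AVLIMFWPG"),
    "polar": set("STCYNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def group_of(residue: str) -> str:
    """Return the name of the group containing a one-letter residue code."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def classify_substitutions(original: str, analog: str) -> list[tuple]:
    """List (position, old, new, kind) for each differing position (1-based)."""
    if len(original) != len(analog):
        raise ValueError("sequences must have equal length (substitutions only)")
    changes = []
    for pos, (old, new) in enumerate(zip(original, analog), start=1):
        if old == new:
            continue
        same_group = group_of(old) == group_of(new)
        changes.append((pos, old, new, "conserved" if same_group else "non-conserved"))
    return changes


# Hypothetical sequences for illustration only.
print(classify_substitutions("MKLDE", "MRLDA"))
```

A real analysis would more likely rely on an established substitution matrix than on a fixed grouping; the grouping is used here only to keep the example short.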

[0081] One or more amino acid insertions may be introduced into a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein. Amino acid insertions may consist of single amino acid residues or sequential amino acids ranging from 2 to 15 amino acids in length.

[0082] Deletions may consist of the removal of one or more amino acids, or discrete portions, from the GlcNAc-TV-b or GlcNAc-TV-c amino acid sequence. The deleted amino acids may or may not be contiguous. The lower limit length of the resulting analog with a deletion mutation is about 10 amino acids, preferably 100 amino acids.

[0083] Allelic variants of GlcNAc-TV-b or GlcNAc-TV-c at the protein level differ from one another by only one or, at most, a few amino acid substitutions. A species variation of a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein is a variation which is naturally occurring among different species of an organism.

[0084] The proteins of the invention also include homologs of GlcNAc-TV-b or GlcNAc-TV-c and/or truncations thereof as described herein. Such GlcNAc-TV-b or GlcNAc-TV-c homologs include proteins whose amino acid sequences are comprised of the amino acid sequences of GlcNAc-TV-b or GlcNAc-TV-c regions from other species that hybridize under selective hybridization conditions (see discussion of selective and in particular stringent hybridization conditions herein) with a probe used to obtain a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein. These homologs will generally have the same regions which are characteristic of a GlcNAc-TV-b or GlcNAc-TV-c Protein. It is anticipated that a protein comprising an amino acid sequence which has at least 70% identity, more preferably at least 75% identity, most preferably 80 to 90% identity, with an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12 will be a homolog of a protein of the invention. A percent amino acid sequence homology or identity is calculated using the methods described herein, preferably the computer programs described herein.
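As a rough illustration of the identity thresholds, the following sketch computes percent identity for two sequences that have already been aligned. It is not the program referred to in the specification; the function name, the gap character, the choice to count gapped columns in the denominator, and the toy sequences are all assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns holding identical residues.

    Columns where both sequences have a gap are ignored; columns with a
    gap in only one sequence count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences, not taken from the sequence listing.
identity = percent_identity("MKLVSTAQWE", "MKLVATAQ-E")
print(identity, identity >= 70.0)
```

Different alignment programs use different conventions for gaps and alignment length, so values from this sketch are not directly comparable with those produced by any particular package.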

[0085] The invention also contemplates isoforms of the proteins of the invention. An isoform contains the same number and kinds of amino acids as the protein of the invention, but the isoform has a different molecular structure. The isoforms contemplated by the present invention preferably have the same properties as the protein of the invention as described herein.

[0086] The present invention also includes GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins conjugated with a selected protein, or a marker protein (see below), or other glycosyltransferase, to produce fusion proteins or chimeric proteins.

[0087] A GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention may be prepared using recombinant DNA methods. Accordingly, the nucleic acids of the present invention having a sequence which encodes a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention may be incorporated in a known manner into an appropriate expression vector which ensures good expression of the protein. Possible expression vectors include but are not limited to cosmids, plasmids, or modified viruses (e.g. replication defective retroviruses, adenoviruses and adeno-associated viruses), so long as the vector is compatible with the host cell used.

[0088] The invention therefore contemplates a recombinant expression vector of the invention containing a nucleic acid molecule of the invention, and the necessary regulatory sequences for the transcription and translation of the inserted protein-coding sequence. Suitable regulatory sequences may be derived from a variety of sources, including bacterial, fungal, viral, mammalian, or insect genes (for example, see the regulatory sequences described in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990)). Selection of appropriate regulatory sequences is dependent on the host cell chosen as discussed below, and may be readily accomplished by one of ordinary skill in the art. The necessary regulatory sequences may be supplied by the native GlcNAc-TV Protein and/or its flanking regions.

[0089] The invention further provides a recombinant expression vector comprising a nucleic acid molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is linked to a regulatory sequence in a manner which allows for expression, by transcription of the DNA molecule, of an RNA molecule which is antisense to a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 7, 8, 9, or 11. Regulatory sequences linked to the antisense nucleic acid can be chosen which direct the continuous expression of the antisense RNA molecule in a variety of cell types, for instance a viral promoter and/or enhancer, or regulatory sequences can be chosen which direct tissue or cell type specific expression of antisense RNA.
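The relationship between a sense DNA sequence and the antisense RNA transcribed from an insert cloned in antisense orientation can be sketched as below. The helper names and the six-base example are assumptions for illustration; the example sequence is not taken from the sequence listing. The antisense RNA is the reverse complement of the sense strand, written 5′ to 3′, with U in place of T.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """Return the RNA complementary to the transcript of `sense_dna`."""
    return reverse_complement(sense_dna).replace("T", "U")


# Hypothetical short sense sequence for illustration only.
print(antisense_rna("ATGGCC"))
```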

[0090] The recombinant expression vectors of the invention may also contain a marker gene which facilitates the selection of host cells transformed or transfected with a recombinant molecule of the invention. Examples of marker genes are genes encoding a protein such as G418, dhfr, npt, als, pat and hygromycin which confer resistance to certain drugs, β-galactosidase, chloramphenicol acetyltransferase, firefly luciferase, trpB, hisD, herpes simplex virus thymidine kinase, adenine phosphoribosyl transferase, or an immunoglobulin or portion thereof such as the Fc portion of an immunoglobulin, preferably IgG. Visible markers such as anthocyanins, beta-glucuronidase and its substrate GUS, and luciferase and its substrate luciferin, can be used to identify transformants, and also to quantify the amount of transient or stable protein expression attributable to a specific vector system (Rhodes, C. et al. (1995) Mol. Biol. 55:121-131). The markers can be introduced on a separate vector from the nucleic acid of interest.

[0091] The recombinant expression vectors may also contain genes that encode a fusion moiety which increases expression of the recombinant protein, increases solubility of the recombinant protein, and aids in the purification of the target recombinant protein by acting as a ligand in affinity purification. For example, a proteolytic cleavage site may be added to the target recombinant protein to allow separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Typical fusion expression vectors include pGEX (Amrad Corp., Melbourne, Australia), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the recombinant protein.

[0092] The vectors may be introduced into host cells to produce a transformed or transfected host cell. The terms “transfected” and “transfection” encompass the introduction of nucleic acid (e.g. a vector) into a cell by one of many standard techniques. A cell is “transformed” by a nucleic acid when the transfected nucleic acid effects a phenotypic change. Prokaryotic cells can be transfected or transformed with nucleic acid by, for example, electroporation or calcium-chloride mediated transformation. Nucleic acid can be introduced into mammalian cells via conventional techniques such as calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofectin, electroporation or microinjection. Suitable methods for transforming and transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory textbooks.

[0093] Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of DNA than can be contained and expressed in a plasmid. HACs of 6 to 10M are constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, or vesicles) for therapeutic purposes.

[0094] Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For example, the proteins of the invention may be expressed in bacterial cells such as E. coli, insect cells (using baculovirus), yeast cells, or mammalian cells. Other suitable host cells can be found in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990).

[0095] A host cell may also be chosen which modulates the expression of an inserted nucleic acid sequence, or modifies (e.g. glycosylation or phosphorylation) and processes (e.g. cleaves) the protein in a desired fashion. Host systems or cell lines may be selected which have specific and characteristic mechanisms for post-translational processing and modification of proteins. For example, eukaryotic host cells including CHO, VERO, BHK, A431, HeLa, COS, MDCK, 293, 3T3, and WI38 may be used. For long-term high-yield stable expression of the protein, cell lines and host systems which stably express the gene product may be engineered.

[0096] Host cells and in particular cell lines produced using the methods described herein may be particularly useful in screening and evaluating compounds that modulate the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein.

[0097] The proteins of the invention may also be expressed in non-human transgenic animals including but not limited to mice, rats, rabbits, guinea pigs, micro-pigs, goats, sheep, pigs, and non-human primates (e.g. baboons, monkeys, and chimpanzees) (see Hammer et al. (Nature 315:680-683, 1985), Palmiter et al. (Science 222:809-814, 1983), Brinster et al. (Proc. Natl. Acad. Sci. USA 82:4438-4442, 1985), Palmiter and Brinster (Cell 41:343-345, 1985) and U.S. Pat. No. 4,736,866). Procedures known in the art may be used to introduce a nucleic acid molecule of the invention encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein into animals to produce the founder lines of transgenic animals. Such procedures include pronuclear microinjection, retrovirus mediated gene transfer into germ lines, gene targeting in embryonic stem cells, electroporation of embryos, and sperm-mediated gene transfer.

[0098] The present invention contemplates transgenic animals that carry the GlcNAc-TV-b or GlcNAc-TV-c gene in all their cells, and animals which carry the transgene in some but not all their cells. The transgene may be integrated as a single transgene or in concatamers. The transgene may be selectively introduced into and activated in specific cell types (see for example, Lasko et al, 1992 Proc. Natl. Acad. Sci. USA 89: 6236). The transgene may be integrated into the chromosomal site of the endogenous gene by gene targeting. The transgene may be selectively introduced into a particular cell type, inactivating the endogenous gene in that cell type (see Gu et al Science 265: 103-106).

[0099] The expression of a recombinant GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein in a transgenic animal may be assayed using standard techniques. Initial screening may be conducted by Southern blot analysis or PCR methods to analyze whether the transgene has been integrated. The level of mRNA expression in the tissues of transgenic animals may also be assessed using techniques including Northern blot analysis of tissue samples, in situ hybridization, and RT-PCR. Tissue may also be evaluated immunocytochemically using antibodies against a GlcNAc-TV-b Protein or GlcNAc-TV-c Protein of the invention.

[0100] Proteins of the invention may also be prepared by chemical synthesis using techniques well known in the chemistry of proteins such as solid phase synthesis (Merrifield, 1964, J. Am. Chem. Soc. 85:2149-2154) or synthesis in homogenous solution (Houben-Weyl, 1987, Methods of Organic Chemistry, ed. E. Wünsch, Vol. 15 I and II, Thieme, Stuttgart). Protein synthesis may be performed using manual procedures or by automation. Automated synthesis may be carried out, for example, using an Applied Biosystems 431A peptide synthesizer (Perkin Elmer). Various fragments of the proteins of the invention may be chemically synthesized separately and combined using chemical methods to produce the full length molecule.

[0101] N-terminal or C-terminal fusion proteins or chimeric proteins comprising a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein of the invention conjugated with other molecules, such as proteins (e.g. markers or other glycosyltransferases), may be prepared by fusing, through recombinant techniques, the N-terminal or C-terminal of a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein, and the sequence of a selected protein or marker protein with a desired biological function. The resultant fusion proteins contain a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein fused to the selected protein or marker protein as described herein. Examples of proteins which may be used to prepare fusion proteins include immunoglobulins, glutathione-S-transferase (GST), protein A, hemagglutinin (HA), and truncated myc.

Antibodies

[0102] A protein of the invention, or a portion thereof, can be used to prepare antibodies specific for the proteins. Antibodies can be prepared which bind a distinct epitope in an unconserved region of the protein. An unconserved region of the protein is one that does not have substantial sequence homology to other proteins. A region from a conserved region such as a well-characterized domain can also be used to prepare an antibody to a conserved region of a protein of the invention.

[0103] In an embodiment of the invention, oligopeptides, peptides, or fragments used to induce antibodies to a protein of the invention have an amino acid sequence consisting of at least 5 amino acids and more preferably at least 10 amino acids. The oligopeptides, etc. can be identical to a portion of the amino acid sequence of the natural protein, and they may contain the entire amino acid sequence of a small, naturally occurring molecule. Antibodies having specificity for a protein of the invention may also be raised from fusion proteins created by expressing fusion proteins in bacteria as described herein.

[0104] The invention can employ intact monoclonal or polyclonal antibodies, and immunologically active fragments (e.g. a Fab or (Fab)₂ fragment), an antibody heavy chain, an antibody light chain, a genetically engineered single chain Fv molecule (Ladner et al, U.S. Pat. No. 4,946,778), or a chimeric antibody, for example, an antibody which contains the binding specificity of a murine antibody, but in which the remaining portions are of human origin. Antibodies including monoclonal and polyclonal antibodies, fragments and chimeras, etc. may be prepared using methods known to those skilled in the art.

Applications of the Nucleic Acid Molecules, Proteins, and Antibodies of the Invention

[0105] The nucleic acid molecules, GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins, and antibodies of the invention may be used in the prognostic and diagnostic evaluation of conditions requiring modulation of a nucleic acid or protein of the invention including cancer, and the identification of subjects with a predisposition to such conditions (see below). Methods for detecting nucleic acid molecules and proteins of the invention can be used to monitor conditions requiring modulation of the nucleic acids or proteins including cancer (e.g. solid tumors, such as breast and uterine cancer) by detecting and localizing the proteins and nucleic acids. It would also be apparent to one skilled in the art that the methods described herein may be used to study the developmental expression of the proteins of the invention and, accordingly, will provide further insight into the role of the proteins. The applications of the present invention also include methods for the identification of compounds which modulate the biological activity of a protein of the invention (see below). The compounds, antibodies, etc. may be used for the treatment of conditions requiring modulation of proteins of the invention including cancer (e.g. solid tumors, such as breast and uterine cancer) (see below).

Diagnostic Methods

[0106] A variety of methods can be employed for the diagnostic and prognostic evaluation of conditions requiring modulation of a nucleic acid or protein of the invention including cancer (e.g. solid tumors, breast and uterine cancer), and the identification of subjects with a predisposition to such conditions. Such methods may, for example, utilize nucleic acid molecules of the invention, and fragments thereof, and antibodies directed against proteins of the invention, including peptide fragments. In particular, the nucleic acids and antibodies may be used, for example, for: (1) the detection of the presence of glcNAc-TV-b or glcNAc-TV-c mutations, or the detection of either over- or under-expression of GlcNAc-TV-b or GlcNAc-TV-c mRNA relative to a non-disorder state, or the qualitative or quantitative detection of alternatively spliced forms of glcNAc-TV-b or glcNAc-TV-c transcripts which may correlate with certain conditions or susceptibility toward such conditions; and (2) the detection of either an over- or an under-abundance of a protein of the invention relative to a non-disorder state, or the presence of a modified (e.g., less than full length) protein of the invention which correlates with a disorder state, or a progression toward a disorder state.

[0107] The methods described herein may be performed by utilizing pre-packaged diagnostic kits comprising at least one specific nucleic acid or antibody described herein, which may be conveniently used, e.g., in clinical settings, to screen and diagnose patients and to screen and identify those individuals exhibiting a predisposition to developing a disorder.

[0108] Nucleic acid-based detection techniques and peptide detection techniques are described below. The samples that may be analyzed using the methods of the invention include those which are known or suspected to express glcNAc-TV-b or glcNAc-TV-c or contain a protein of the invention. The methods may be performed on biological samples including but not limited to cells, lysates of cells which have been incubated in cell culture, chromosomes isolated from a cell (e.g. a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern analysis), RNA (in solution or bound to a solid support such as for northern analysis), cDNA (in solution or bound to a solid support), an extract from cells or a tissue, and biological fluids such as serum, urine, blood, and CSF. The samples may be derived from a patient or a culture.

Methods for Detecting Nucleic Acid Molecules of the Invention

[0109] A nucleic acid molecule encoding a protein of the invention may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; or in dipstick, pin, ELISA assays or microarrays utilizing fluids or tissues from patient biopsies to detect altered expression. Such qualitative or quantitative methods are well known in the art and some methods are described below.

[0110] The nucleic acid molecules of the invention allow those skilled in the art to construct nucleotide probes for use in the detection of nucleic acid sequences of the invention in biological materials. Suitable probes include nucleic acid molecules based on nucleic acid sequences encoding at least 5 sequential amino acids from regions of the GlcNAc-TV-b or GlcNAc-TV-c nucleic acid molecules (see SEQ. ID. NO. 1, 3, 5, 7, 8, 9, or 11); preferably they comprise 15 to 30 nucleotides. A nucleotide probe may be labeled with a detectable substance such as a radioactive label which provides for an adequate signal and has sufficient half-life such as ³²P, ³H, ¹⁴C or the like. Other detectable substances which may be used include antigens that are recognized by a specific labeled antibody, fluorescent compounds, enzymes, antibodies specific for a labeled antigen, and luminescent compounds. An appropriate label may be selected having regard to the rate of hybridization and binding of the probe to the nucleotide to be detected and the amount of nucleotide available for hybridization. Labeled probes may be hybridized to nucleic acids on solid supports such as nitrocellulose filters or nylon membranes as generally described in Sambrook et al, 1989, Molecular Cloning, A Laboratory Manual (2nd ed.). The nucleic acid probes may be used to detect glcNAc-TV-b or glcNAc-TV-c genes, preferably in human cells. The nucleotide probes may also be useful, for example, in the diagnosis or prognosis of cancer, the staging of the cancer, and in monitoring the progression of these conditions, or monitoring a therapeutic treatment. The probes may also be useful for mapping the naturally occurring genomic sequence. Sequences can be mapped to a particular chromosome, to a specific region of a chromosome, or to an artificial chromosome construction (e.g. HACs, yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions or single chromosome cDNA libraries) (see Price, C. M. 1993, Blood Rev. 7:127-134 and Trask, B. J. 1991, Trends Genet. 7:149-154).

[0111] The probe may be used in hybridization techniques to detect glcNAc-TV-b or glcNAc-TV-c genes. The technique generally involves contacting and incubating nucleic acids (e.g. recombinant DNA molecules, cloned genes) obtained from a sample from a patient or other cellular source with a probe of the present invention under conditions favourable for the specific annealing of the probes to complementary sequences in the nucleic acids. After incubation, the non-annealed nucleic acids are removed, and the presence of nucleic acids that have hybridized to the probe, if any, is detected.

[0112] The detection of nucleic acid molecules of the invention may involve the amplification of specific gene sequences using an amplification method such as PCR, followed by the analysis of the amplified molecules using techniques known to those skilled in the art. Suitable primers can be routinely designed by one of skill in the art.

[0113] Genomic DNA may be used in hybridization or amplification assays of biological samples to detect abnormalities involving glcNAc-TV-b or glcNAc-TV-c structure, including point mutations, insertions, deletions, and chromosomal rearrangements. For example, direct sequencing, single stranded conformational polymorphism analyses, heteroduplex analysis, denaturing gradient gel electrophoresis, chemical mismatch cleavage, and oligonucleotide hybridization may be utilized.

[0114] Genotyping techniques known to one skilled in the art can be used to type polymorphisms that are in close proximity to the mutations in a glcNAc-TV-b or glcNAc-TV-c gene. The polymorphisms may be used to identify individuals in families that are likely to carry mutations. If a polymorphism exhibits linkage disequilibrium with mutations in the glcNAc-TV-b or glcNAc-TV-c genes, it can also be used to screen for individuals in the general population likely to carry mutations. Polymorphisms which may be used include restriction fragment length polymorphisms (RFLPs), single-base polymorphisms, and simple sequence length polymorphisms (SSLPs).

[0115] A probe of the invention may be used to directly identify RFLPs. A probe or primer of the invention can additionally be used to isolate genomic clones such as YACs, BACs, PACs, cosmids, phage or plasmids. The DNA in the clones can be screened for SSLPs using hybridization or sequencing procedures.
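How a restriction fragment length polymorphism arises can be illustrated with a minimal in silico digest. The sketch assumes a linear, single-stranded representation and uses the EcoRI recognition site GAATTC (cut after the first G) purely as an example enzyme; the two toy alleles are hypothetical and not derived from the sequence listing. An allele that loses a site through a single base change yields a different set of fragment lengths.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at each site.

    `cut_offset` is the cut position within the recognition site
    (1 for EcoRI, which cuts between G and AATTC).
    """
    dna = dna.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical toy alleles: allele_b carries a single base change (A to G)
# that destroys the second recognition site.
allele_a = "AAAAGAATTCCCCCGAATTCTTTT"
allele_b = "AAAAGAATTCCCCCGAGTTCTTTT"
print(digest(allele_a))
print(digest(allele_b))
```

On a Southern blot, a labeled probe would reveal only those fragments to which it hybridizes, so the observed band pattern depends on where the probe sequence lies relative to the polymorphic site.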

[0116] Hybridization and amplification techniques described herein may be used to assay qualitative and quantitative aspects of glcNAc-TV-b or glcNAc-TV-c expression. For example, RNA may be isolated from a cell type or tissue known to express glcNAc-TV-b (e.g. brain) and tested utilizing the hybridization (e.g. standard Northern analyses) or PCR techniques referred to herein. The techniques may be used to detect differences in transcript size which may be due to normal or abnormal alternative splicing. The techniques may be used to detect quantitative differences between levels of full length and/or alternatively spliced transcripts detected in normal individuals relative to those individuals exhibiting symptoms of a disease such as cancer.

[0117] The primers and probes may be used in the above described methods in situ, i.e. directly on tissue sections (fixed and/or frozen) of patient tissue obtained from biopsies or resections.

Microarrays

[0118] Oligonucleotides derived from any of the nucleic acid molecules of the invention may be used as targets in microarrays. “Microarray” refers to an array of distinct polynucleotides or oligonucleotides synthesized on a substrate, such as paper, nylon, or other type of membrane, filter, chip, glass slide, or any other suitable solid support.

[0119] The microarrays can be used to monitor the expression level of large numbers of genes simultaneously (to produce a transcript image) and to identify genetic variants, mutations, and polymorphisms. This information can be useful in determining gene function, understanding the genetic basis of disease, diagnosing disease, and in developing and monitoring the activity of therapeutic agents (Heller, R. et al. (1997) Proc. Natl. Acad. Sci. 94:2150-55).

[0120] In an embodiment of the invention, the microarray is prepared and used according to the methods described in PCT application WO95/11995 (Chee et al.), Lockhart D. J. et al. (1996, Nat. Biotech. 14:1675-1680) and Schena M. et al. (1996, Proc. Natl. Acad. Sci. 93:10614-10619).

[0121] The microarray can be composed of a large number of unique, single-stranded nucleic acid sequences, usually either synthetic antisense oligonucleotides or fragments of cDNAs fixed to a solid support. The oligonucleotides can be about 6-60 nucleotides in length, preferably 15-30 nucleotides in length, and most preferably about 20-25 nucleotides in length. For some microarrays it may be preferred to use oligonucleotides which are about 7-10 nucleotides in length. The microarray can contain oligonucleotides covering the known 5′ or 3′ sequence, sequential oligonucleotides covering the full length sequence, or unique oligonucleotides selected from particular areas along the length of the sequence. Polynucleotides used in the microarray can be oligonucleotides specific to a gene(s) of interest in which at least a fragment of the sequence is known or that are specific to one or more unidentified cDNAs which are common to particular cell types, or developmental or disease state.

[0122] To produce oligonucleotides to a known sequence for a microarray, a gene of interest is examined using a computer algorithm which starts at the 5′ or more preferably at the 3′ end of the nucleotide sequence. The algorithm identifies oligomers of a defined length that are unique to the gene, have a GC content within a suitable range for hybridization, and lack predicted secondary structure that can interfere with hybridization. In some cases it may be appropriate to use pairs of oligonucleotides on a microarray. The “pairs” will be identical, except for a single nucleotide which can be located in the center of the sequence. The second oligonucleotide in the pair serves as a control. The number of oligonucleotide pairs may range from two to one million. The oligomers are synthesized at designated areas on a substrate using a light-directed chemical process.
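The oligomer selection described in the preceding paragraph can be illustrated with a minimal Python sketch. It is not the algorithm used by the inventors; the function names, the default length of 20 nucleotides, and the 40-60% GC window are illustrative assumptions, and the secondary-structure filter mentioned in the text is deliberately omitted because it would require a folding predictor.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_oligos(gene, others, length=20, gc_min=0.4, gc_max=0.6):
    """Scan a gene from its 3' end and return (start, oligo) candidates.

    A window is kept when its GC fraction lies in [gc_min, gc_max], it
    occurs exactly once in the gene, and it is absent from every sequence
    in `others` (a hypothetical background set of other genes).
    Secondary-structure prediction is NOT performed in this sketch.
    """
    gene = gene.upper()
    others = [o.upper() for o in others]
    kept = []
    # Walk window start positions from the 3' end toward the 5' end.
    for start in range(len(gene) - length, -1, -1):
        oligo = gene[start:start + length]
        if not gc_min <= gc_fraction(oligo) <= gc_max:
            continue
        # Unique within the gene: first hit is here and there is no later hit.
        if gene.find(oligo) != start or gene.find(oligo, start + 1) != -1:
            continue
        if any(oligo in o for o in others):
            continue
        kept.append((start, oligo))
    return kept


def mismatch_control(oligo):
    """Return the paired control: identical except for the central base."""
    swap = {"A": "T", "T": "A", "G": "C", "C": "G"}
    mid = len(oligo) // 2
    return oligo[:mid] + swap[oligo[mid]] + oligo[mid + 1:]
```

Each oligomer returned by `candidate_oligos` can be passed to `mismatch_control` to obtain the second member of a pair. Substituting the complementary base at the center is only one possible choice of mismatch.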

[0123] The oligomers can be synthesized on the surface of the substrate by using a chemical coupling procedure and an ink jet application apparatus, such as described in PCT application WO95/251116 (Baldeschweiler et al.). A “gridded” array analogous to a dot (or slot) blot can also be used to arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a vacuum system, thermal, UV, mechanical or chemical bonding procedures. An array can be produced by hand or using available devices (slot blot or dot blot apparatus), materials (any suitable solid support), and machines (including robotic instruments) and it can contain 8, 24, 96, 384, 1536 or 6144 oligonucleotides, or any other multiple between two and one million which lends itself to the efficient use of commercially available instrumentation.

[0124] Sample analysis using microarrays is conducted by making RNA or DNA from a biological sample into hybridization probes. The mRNA is isolated, and cDNA is prepared and used as a template to make antisense RNA (aRNA). The aRNA is amplified in the presence of fluorescent nucleotides, and labeled hybridization probes are incubated with the microarray so that the probe sequences hybridize to complementary oligonucleotides of the microarray. Incubation conditions are selected so that hybridization occurs with precise complementary matches or with various degrees of less complementarity. After removal of nonhybridized probes, a scanner determines the levels and patterns of fluorescence. The scanned images are examined to determine the degree of complementarity and the relative quantity of each oligonucleotide sequence on the microarray. The biological samples may be obtained from any bodily fluids (such as blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations. A detection system can be used to measure the absence, presence, and amount of hybridization for all of the distinct sequences simultaneously. This data can be used for large scale correlation studies on the sequences, mutations, variants, or polymorphisms among samples.

Methods for Detecting Proteins

[0125] Antibodies specifically reactive with a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein, or derivatives, such as enzyme conjugates or labeled derivatives, may be used to detect GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins in various biological materials. They may be used as diagnostic or prognostic reagents and they may be used to detect abnormalities in the level of expression of GlcNAc-TV-b Proteins, GlcNAc-TV-b Related Proteins, GlcNAc-TV-c Proteins, or GlcNAc-TV-c Related Proteins, or abnormalities in the structure, and/or temporal, tissue, cellular, or subcellular location of the proteins. Antibodies may also be used to screen potentially therapeutic compounds in vitro to determine their effects on a condition such as cancer. In vitro immunoassays may also be used to assess or monitor the efficacy of particular therapies. The antibodies of the invention may also be used in vitro to determine the level of GlcNAc-TV-b or GlcNAc-TV-c expression in cells genetically engineered to produce a GlcNAc-TV-b Protein, a GlcNAc-TV-b Related Protein, a GlcNAc-TV-c Protein, or a GlcNAc-TV-c Related Protein.

[0126] The antibodies may be used in any known immunoassays which rely on the binding interaction between an antigenic determinant of a protein of the invention, and the antibodies. Examples of such assays are radioimmunoassays, enzyme immunoassays (e.g. ELISA), immunofluorescence, immunoprecipitation, latex agglutination, hemagglutination, and histochemical tests. The antibodies may be used to detect and quantify proteins of the invention in a sample in order to determine their role in particular cellular events or pathological states, and to diagnose and treat such pathological states.

[0127] In particular, the antibodies of the invention may be used in immuno-histochemical analyses, for example, at the cellular and subcellular level, to detect a protein of the invention, to localise it to particular cells and tissues, and to specific subcellular locations, and to quantitate the level of expression.

[0128] Cytochemical techniques known in the art for localizing antigens using light and electron microscopy may be used to detect a protein of the invention. Generally, an antibody of the invention may be labeled with a detectable substance and a protein may be localised in tissues and cells based upon the presence of the detectable substance. Various methods of labeling polypeptides and glycoproteins are known in the art and may be used. Examples of detectable substances include, but are not limited to, the following: radioisotopes (e.g., ³H, ¹⁴C, ³⁵S, ¹²⁵I, ¹³¹I), fluorescent labels (e.g., FITC, rhodamine, lanthanide phosphors), luminescent labels such as luminol, enzymatic labels (e.g., horseradish peroxidase, β-galactosidase, luciferase, alkaline phosphatase, acetylcholinesterase), biotinyl groups (which can be detected by marked avidin e.g., streptavidin containing a fluorescent marker or enzymatic activity that can be detected by optical or colorimetric methods), and predetermined polypeptide epitopes recognized by a secondary reporter (e.g., leucine zipper pair sequences, binding sites for secondary antibodies, metal binding domains, epitope tags). In some embodiments, labels are attached via spacer arms of various lengths to reduce potential steric hindrance. Antibodies may also be coupled to electron dense substances, such as ferritin or colloidal gold, which are readily visualised by electron microscopy.

[0129] The antibody or sample may be immobilized on a carrier or solid support which is capable of immobilizing cells, antibodies etc. For example, the carrier or support may be nitrocellulose, or glass, polyacrylamides, gabbros, and magnetite. The support material may have any possible configuration including spherical (e.g. bead), cylindrical (e.g. inside surface of a test tube or well, or the external surface of a rod), or flat (e.g. sheet, test strip). Indirect methods may also be employed in which the primary antigen-antibody reaction is amplified by the introduction of a second antibody, having specificity for the antibody reactive against a protein of the invention. By way of example, if the antibody having specificity against a protein of the invention is a rabbit IgG antibody, the second antibody may be goat anti-rabbit gamma-globulin labelled with a detectable substance as described herein.

[0130] Where a radioactive label is used as a detectable substance, a protein of the invention may be localized by radioautography. The results of radioautography may be quantitated by determining the density of particles in the radioautographs by various optical methods, or by counting the grains.

Methods for Identifying or Evaluating Substances/Compounds

[0131] The methods described herein are designed to identify substances that modulate the biological activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein including substances that interfere with, or enhance the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein.

[0132] The substances and compounds identified using the methods of the invention include but are not limited to peptides such as soluble peptides including Ig-tailed fusion peptides, members of random peptide libraries and combinatorial chemistry-derived molecular libraries including libraries made of D- and/or L-configuration amino acids, phosphopeptides (including members of random or partially degenerate, directed phosphopeptide libraries), antibodies [e.g. polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, single chain antibodies, fragments (e.g. Fab, F(ab)₂, and Fab expression library fragments, and epitope-binding fragments thereof)], and small organic or inorganic molecules. The substance or compound may be an endogenous physiological compound or it may be a natural or synthetic compound.

[0133] Substances which modulate a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be identified based on their ability to associate with (or bind to) a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Therefore, the invention also provides methods for identifying substances which associate with a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Substances identified using the methods of the invention may be isolated, cloned and sequenced using conventional techniques. A substance that associates with a protein of the invention may be an agonist or antagonist of the biological or immunological activity of a polypeptide of the invention.

[0134] The term “agonist” refers to a molecule that increases the amount of, or prolongs the duration of, the activity of the protein. The term “antagonist” refers to a molecule which decreases the biological or immunological activity of the protein. Agonists and antagonists may include proteins, nucleic acids, carbohydrates, or any other molecules that associate with a polypeptide of the invention.

[0135] Substances which can associate with a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be identified by reacting a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein with a test substance which potentially associates with a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, under conditions which permit the association, and removing and/or detecting GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein associated with the test substance. Substance-protein complexes, free substance, or non-complexed protein may be assayed. Conditions which permit the formation of substance-protein complexes may be selected having regard to factors such as the nature and amounts of the substance and the protein.

[0136] The substance-protein complex, free substance or non-complexed proteins may be isolated by conventional isolation techniques, for example, salting out, chromatography, electrophoresis, gel filtration, fractionation, absorption, polyacrylamide gel electrophoresis, agglutination, or combinations thereof. To facilitate the assay of the components, antibody against a protein of the invention or the substance, or labeled protein, or a labeled substance may be utilized. The antibodies, proteins, or substances may be labeled with a detectable substance as described above.

[0137] A GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, or the substance used in the method of the invention may be insolubilized. For example, a protein or substance may be bound to a suitable carrier such as agarose, cellulose, dextran, Sephadex, Sepharose, carboxymethyl cellulose, polystyrene, filter paper, ion-exchange resin, plastic film, plastic tube, glass beads, polyamine-methyl vinyl-ether-maleic acid copolymer, amino acid copolymer, ethylene-maleic acid copolymer, nylon, silk, etc. The carrier may be in the shape of, for example, a tube, test plate, beads, disc, sphere etc. The insolubilized protein or substance may be prepared by reacting the material with a suitable insoluble carrier using known chemical or physical methods, for example, cyanogen bromide coupling.

[0138] The invention also contemplates a method for evaluating a compound for its ability to modulate the biological activity of a protein of the invention, by assaying for an agonist or antagonist (i.e. enhancer or inhibitor) of the association of the protein with a substance which associates with the protein. The basic method for evaluating if a compound is an agonist or antagonist of the association of a protein of the invention and a substance that associates with the protein, is to prepare a reaction mixture containing the protein and the substance under conditions which permit the formation of substance-protein complexes, in the presence of a test compound. The test compound may be initially added to the mixture, or may be added subsequent to the addition of the protein and substance. Control reaction mixtures without the test compound or with a placebo are also prepared. The formation of complexes is detected and the formation of complexes in the control reaction but not in the reaction mixture indicates that the test compound interferes with the interaction of the protein and substance. The reactions may be carried out in the liquid phase or the protein, substance, or test compound may be immobilized as described herein.

[0139] It will be understood that the agonists and antagonists, i.e. inhibitors and enhancers, that can be assayed using the methods of the invention may act on one or more of the binding sites on the protein or substance including agonist binding sites, competitive antagonist binding sites, non-competitive antagonist binding sites or allosteric sites.

[0140] The invention also makes it possible to screen for antagonists that inhibit the effects of an agonist of the interaction of a protein of the invention with a substance which is capable of binding to the protein. Thus, the invention may be used to assay for a compound that competes for the same binding site of a protein of the invention.

[0141] Substances that modulate a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein of the invention can be identified based on their ability to interfere with or enhance the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Therefore, the invention provides a method for evaluating a compound for its ability to modulate the activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein comprising (a) reacting an acceptor and a sugar donor for a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein in the presence of a test substance; (b) measuring the amount of sugar donor transferred to acceptor; and (c) carrying out steps (a) and (b) in the absence of the test substance to determine if the substance interferes with or enhances transfer of the sugar donor to the acceptor by the GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein.
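The comparison in steps (a) to (c) reduces to a ratio of transfer measured with and without the test substance. The following Python sketch illustrates one way to express it; the function names, the optional background (no-enzyme blank) subtraction, and the 20% decision threshold are hypothetical choices made for illustration and are not specified in this description.

```python
def percent_change(with_test, without_test, background=0.0):
    """Percent change in sugar transfer caused by a test substance.

    `with_test` and `without_test` are signal readings (for example counts
    of labeled donor incorporated into acceptor) from steps (a)/(b) and
    step (c). `background` is an optional blank reading subtracted from both.
    """
    control = without_test - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = with_test - background
    return 100.0 * (treated - control) / control


def classify(change_pct, threshold=20.0):
    """Label a substance using an illustrative, arbitrary threshold."""
    if change_pct <= -threshold:
        return "inhibitor"
    if change_pct >= threshold:
        return "enhancer"
    return "no clear effect"
```

For example, readings of 600 with the test substance and 1100 without it, over a background of 100, give a change of -50%, which this sketch labels an inhibitor. In practice replicate measurements and a statistical criterion would replace the fixed threshold.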

[0142] Suitable acceptors for use in the method of the invention are a saccharide, oligosaccharides, polysaccharides, glycopeptides, glycoproteins, or glycolipids which are either synthetic with linkers at the reducing end or naturally occurring structures, for example, asialo-agalacto-fetuin glycopeptide.

[0143] The sugar donor may be a nucleotide sugar, dolichol-phosphate-sugar or dolichol-pyrophosphate-oligosaccharide, for example, uridine diphospho-N-acetylglucosamine (UDP-GlcNAc), or derivatives or analogs thereof. The GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be obtained from natural sources or produced using recombinant methods as described herein.

[0144] The acceptor or sugar donor may be labeled with a detectable substance as described herein, and the interaction of the protein of the invention with the acceptor and sugar donor will give rise to a detectable change. The detectable change may be colorimetric, photometric, radiometric, potentiometric, etc. The activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein of the invention may also be determined using methods based on HPLC (Koenderman et al., FEBS Lett. 222:42, 1987) or methods employing synthetic oligosaccharide acceptors attached to hydrophobic aglycones (Palcic et al Glycoconjugate 5:49, 1988; and Pierce et al, Biochem. Biophys. Res. Comm. 146: 679, 1987).

[0145] The GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein is reacted with the acceptor and sugar donor at a pH and temperature and in the presence of a metal cofactor, usually a divalent cation like manganese, effective for the protein to transfer the sugar donor to the acceptor, and where one of the components is labeled, to produce a detectable change. It is preferred to use a buffer with the acceptor and sugar donor to maintain the pH within the pH range effective for the proteins. The buffer, acceptor and sugar donor may be used as an assay composition. Other compounds such as EDTA and detergents may be added to the assay composition.

[0146] The reagents suitable for applying the methods of the invention to evaluate compounds that modulate a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein may be packaged into convenient kits providing the necessary materials packaged into suitable containers. The kits may also include suitable supports useful in performing the methods of the invention.

Compositions and Treatments

[0147] The nucleic acid molecules and proteins of the invention and substances or compounds identified by the methods described herein, antibodies, and antisense nucleic acid molecules of the invention may be used for modulating the biological activity of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein, and they may be used to treat or prevent cancer, inhibit or treat tumor metastasis, stimulate hematopoietic progenitor cell growth, confer protection against chemotherapy and radiation therapy in a subject, and/or treat proliferative disorders, microbial or parasitic infections, or neurological disorders.

[0148] The substances, compounds, etc. of the invention may be especially useful in the treatment of various forms of neoplasia such as melanomas, adenomas, sarcomas, and particularly carcinomas of solid tissues in patients. In particular the composition may be used for treating cervico-uterine cancer, cancer of the kidney, brain, stomach, lung, rectum, breast, bowel, gastric, liver, thyroid, neck, cervix, salivary gland, bile duct, pelvis, mediastinum, urethra, bronchogenic, bladder, esophagus and colon, and Kaposi's Sarcoma which is a form of cancer associated with HIV-infected patients with Acquired Immune Deficiency Syndrome (AIDS).

[0149] Accordingly, the proteins, substances, antibodies, and compounds etc. may be formulated into pharmaceutical compositions for administration to subjects in a biologically compatible form suitable for administration in vivo. By “biologically compatible form suitable for administration in vivo” is meant a form of the substance to be administered in which any toxic effects are outweighed by the therapeutic effects. The substances may be administered to living organisms including humans, and animals. A therapeutically active amount of the pharmaceutical compositions of the present invention is defined as an amount effective, at dosages and for periods of time necessary, to achieve the desired result. For example, a therapeutically active amount of a substance may vary according to factors such as the disease state, age, sex, and weight of the individual, and the ability of antibody to elicit a desired response in the individual. Dosage regimens may be adjusted to provide the optimum therapeutic response. For example, several divided doses may be administered daily or the dose may be proportionally reduced as indicated by the exigencies of the therapeutic situation.

[0150] The active substance may be administered in a convenient manner such as by injection (subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or rectal administration. Depending on the route of administration, the active substance may be coated in a material to protect the compound from the action of enzymes, acids and other natural conditions that may inactivate the compound.

[0151] The compositions described herein can be prepared by per se known methods for the preparation of pharmaceutically acceptable compositions which can be administered to subjects, such that an effective quantity of the active substance is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical Sciences (Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., USA 1985). On this basis, the compositions include, albeit not exclusively, solutions of the substances or compounds in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and iso-osmotic with the physiological fluids.

[0152] After pharmaceutical compositions have been prepared, they can be placed in an appropriate container and labeled for treatment of an indicated condition. For administration of a composition of the invention the labeling would include amount, frequency, and method of administration.

[0153] The compositions, substances, compounds etc. may be indicated as therapeutic agents either alone or in conjunction with other therapeutic agents or other forms of treatment (e.g. chemotherapy or radiotherapy). They can be used to enhance activation of macrophages, T cells, and NK cells in the treatment of cancer and immunosuppressive diseases. By way of example, they can be used in combination with anti-proliferative agents, antimicrobial agents, immunostimulatory agents, or anti-inflammatories. In particular, they can be used in combination with anti-viral and/or anti-proliferative agents, such as Th1 cytokines including interleukin-2, interleukin-12, and interferon-γ, and nucleoside analogues such as AZT and 3TC. They can be administered concurrently, separately, or sequentially with other therapeutic agents or therapies.

[0154] The nucleic acid molecules encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein or any fragment thereof, or antisense sequences may be used for therapeutic purposes. Antisense to a nucleic acid molecule encoding a protein of the invention may be used in situations to block the synthesis of the protein. In particular, cells may be transformed with sequences complementary to nucleic acid molecules encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Thus, antisense sequences may be used to modulate GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein activity, or to achieve regulation of gene function. Sense or antisense oligomers or larger fragments can be designed from various locations along the coding or regulatory regions of sequences encoding a protein of the invention.

[0155] Expression vectors may be derived from retroviruses, adenoviruses, herpes or vaccinia viruses or from various bacterial plasmids for delivery of nucleic acid sequences to the target organ, tissue, or cells. Vectors that express antisense nucleic acid sequences of glcNAc-TV-b or glcNAc-TV-c can be constructed using techniques well known to those skilled in the art (see for example, Sambrook et al. (supra)).

[0156] Genes encoding a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein can be turned off by transforming a cell or tissue with expression vectors that express high levels of a nucleic acid molecule or fragment thereof which encodes a protein of the invention. Such constructs may be used to introduce untranslatable sense or antisense sequences into a cell. Even if they do not integrate into the DNA, the vectors may continue to transcribe RNA molecules until all copies are disabled by endogenous nucleases. Transient expression may last for extended periods of time (e.g. a month or more) with a non-replicating vector or if appropriate replication elements are part of the vector system.

[0157] Modification of gene expression may be achieved by designing antisense molecules, DNA, RNA, or peptide nucleic acid (PNA), to the control regions of a glcNAc-TV-b or glcNAc-TV-c gene, i.e. the promoters, enhancers, and introns. Preferably the antisense molecules are oligonucleotides derived from the transcription initiation site (e.g. between positions −10 and +10 from the start site). Inhibition can also be achieved by using triple-helix base-pairing techniques. Triple helix pairing causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules (see Gee J. E. et al (1994) In: Huber, B. E. and B. I. Carr, Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco, N.Y.). An antisense molecule may also be designed to block translation of mRNA by inhibiting binding of the transcript to the ribosomes.
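As an illustration of deriving an antisense oligonucleotide from the −10 to +10 region, the Python sketch below takes the reverse complement of a window around the start site. The function names and the indexing convention are assumptions made for this example: the sense strand is given 5′ to 3′, `plus_one_index` is the zero-based index of the +1 base, and there is no position 0, so the default window spans 20 nucleotides.

```python
# Translation table for DNA complementation (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_around_start(sense, plus_one_index, upstream=10, downstream=10):
    """Antisense oligo covering positions -upstream..+downstream.

    `sense` is the sense strand written 5' to 3'; `plus_one_index` is the
    zero-based index of the +1 base (an assumed convention).
    """
    lo = plus_one_index - upstream
    hi = plus_one_index + downstream
    if lo < 0 or hi > len(sense):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense[lo:hi])
```

The returned sequence is written 5′ to 3′ and is complementary to the chosen window of the sense strand. Chemical modifications and specificity checks against other transcripts are outside the scope of this sketch.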

[0158] Ribozymes may be used to catalyze the specific cleavage of RNA. Ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, hammerhead motif ribozyme molecules may be engineered that can specifically and efficiently catalyze endonucleolytic cleavage of sequences encoding a polypeptide of the invention.

[0159] Specific ribozyme cleavage sites within any RNA target may be initially identified by scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC. Short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the cleavage site of the target gene may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
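The initial scan described above can be sketched in a few lines of Python. The function names and the default window length of 17 nucleotides (within the stated 15 to 20 range) are illustrative assumptions; the secondary-structure evaluation and the ribonuclease protection assay are experimental or require separate tools and are not modeled here.

```python
def find_cleavage_sites(rna, triplets=("GUA", "GUU", "GUC")):
    """Return (index, triplet) for every candidate cleavage triplet.

    DNA input is accepted and converted to RNA by replacing T with U.
    Indices are zero-based positions of the G of each triplet.
    """
    rna = rna.upper().replace("T", "U")
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in triplets:
            sites.append((i, triplet))
    return sites


def flanking_window(rna, site_index, length=17):
    """Return a window of `length` nucleotides roughly centered on a site.

    The window is shifted inward when the site lies near either end of the
    sequence, and is shorter than `length` only if the sequence itself is.
    """
    rna = rna.upper().replace("T", "U")
    center = site_index + 1  # middle base of the triplet
    start = max(0, center - length // 2)
    end = min(len(rna), start + length)
    start = max(0, end - length)
    return rna[start:end]
```

Each window returned by `flanking_window` is a candidate region that could then be examined for secondary structure before any accessibility testing.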

[0160] The activity of the proteins, nucleic acid molecules, substances, compounds, antibodies, antisense nucleic acid molecules, and compositions of the invention may be confirmed in animal experimental model systems.

[0161] The invention also provides methods for studying the function of a GlcNAc-TV-b Protein, GlcNAc-TV-b Related Protein, GlcNAc-TV-c Protein, or GlcNAc-TV-c Related Protein. Cells, tissues, and non-human animals lacking in glcNAc-TV-b or glcNAc-TV-c expression or partially lacking in glcNAc-TV-b or glcNAc-TV-c expression may be developed using recombinant expression vectors of the invention having specific deletion or insertion mutations in the glcNAc-TV-b or glcNAc-TV-c gene. A recombinant expression vector may be used to inactivate or alter the endogenous gene by homologous recombination, and thereby create a glcNAc-TV-b or glcNAc-TV-c deficient cell, tissue or animal.

[0162] Null alleles may be generated in cells, such as embryonic stem cells, by deletion mutation. A recombinant glcNAc-TV-b or glcNAc-TV-c gene may also be engineered to contain an insertion mutation which inactivates glcNAc-TV-b or glcNAc-TV-c. Such a construct may then be introduced into a cell, such as an embryonic stem cell, by a technique such as transfection, electroporation, injection etc. Cells lacking an intact glcNAc-TV-b or glcNAc-TV-c gene may then be identified, for example by Southern blotting, Northern blotting or by assaying for expression of a protein of the invention using the methods described herein. Such cells may then be used to generate transgenic non-human animals deficient in glcNAc-TV-b or glcNAc-TV-c. Germline transmission of the mutation may be achieved, for example, by aggregating the embryonic stem cells with early stage embryos, such as 8 cell embryos, in vitro; transferring the resulting blastocysts into recipient females; and generating germline transmission of the resulting aggregation chimeras. Such a mutant animal may be used to define specific cell populations, developmental patterns and in vivo processes, normally dependent on glcNAc-TV-b or glcNAc-TV-c expression.

[0163] A protein of the invention may be used to support the survival, growth, migration, and/or differentiation of cells expressing the polypeptide. Thus, a polypeptide of the invention may be used as a supplement to support, for example, cells in culture.

Methods for Preparing Oligosaccharides

[0164] The invention relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc and an acceptor in the presence of a protein of the invention.

[0165] Examples of acceptors for use in the method for preparing an oligosaccharide are a saccharide, oligosaccharides, polysaccharides, glycopeptides, glycoproteins, or glycolipids which are either synthetic with linkers at the reducing end or naturally occurring structures, for example, asialo-agalacto-fetuin glycopeptide. The activated GlcNAc may be part of a nucleotide-sugar, a dolichol-phosphate-sugar, or dolichol-pyrophosphate-oligosaccharide.

[0166] In an embodiment of the invention, the oligosaccharides are prepared on a carrier that is non-toxic to a mammal, in particular a lipid isoprenoid or polyisoprenoid alcohol. An example of a suitable carrier is dolichol phosphate. The oligosaccharide may be attached to a carrier via a labile bond allowing for chemical removal of the oligosaccharide from the lipid carrier. In the alternative, the oligosaccharide transferase may be used to transfer the oligosaccharide from a lipid carrier to a protein.

[0167] The following non-limiting examples are illustrative of the present invention:

EXAMPLE 1 Isolation of Human GlcNAc-TVb

[0168] A cDNA sequence of a human GlcNAc-TV homolog was identified by similarity matching using the GenBank EST database (accession number R87580). This EST cDNA clone (designated as hGTNVb) was sequenced (627 base pairs) and, when translated, was shown to be 67% identical to the 3′ end of the human GlcNAc-TV amino acid sequence. This information initiated a search for the entire sequence of this human GlcNAc-TV-like cDNA using two different methods: screening a human brain cDNA library by plaque lifts, and 5′ RACE (rapid amplification of cDNA ends).
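The translation step referred to above can be illustrated with a minimal sketch. It assumes the standard genetic code and uses, purely for illustration, the first 30 nucleotides of SEQ ID NO. 1 as given in the sequence listing; the function name `translate` and the table construction are hypothetical helpers, not part of the described method.

```python
# Standard genetic code (NCBI translation table 1), built from the
# conventional T, C, A, G ordering of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in reading frame 1 ('*' marks a stop codon)."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO. 1, copied from the sequence listing.
orf_start = "atgtttttta caatctcaag aaaaaatatg"
print(translate(orf_start))  # MFFTISRKNM
```

The printed peptide agrees with the first ten residues listed for SEQ ID NO. 2 (Met Phe Phe Thr Ile Ser Arg Lys Asn Met).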

[0169] A human brain 5′STRETCH PLUS cDNA library (λgt10; CLONTECH, Cat# HL3002A) was screened (using standard protocols) with a ³²P-dCTP labeled 203 base pair cDNA probe generated by restriction enzyme digestions of the hGlcNAc-TV-b EST cDNA with NotI and BamHI. Two million phage clones were screened and 4 positive clones were identified. Each of these clones was purified to homogeneity by three subsequent rounds of screening, and phage DNA was isolated from each of these clones using conventional methods. The cDNA insert was isolated from each of these clones and then subcloned into the EcoRI site of the Bluescript vector (Stratagene) and sequenced. Two out of four clones had sequences that were identical to the EST clone and thereby provided no new information. The other two clones were found to be similar to hGlcNAc-TV-b. One clone (1820 base pairs) was identical in sequence to the coding region of the EST clone with an additional 1295 base pairs of 3′ untranslated sequence, and the other clone was 61% identical (amino acid comparison) with hGlcNAc-TV-b and was designated as hGlcNAc-TV-c. Interestingly, the 3′ ends of hGlcNAc-TV-b and hGlcNAc-TV-c are very dissimilar, suggesting that one of these clones is a splice variant of the other.

[0170] The 5′ RACE protocol was used to isolate the 5′ region of the hGlcNAc-TV-b cDNA sequence. For first strand cDNA synthesis, a PCR primer (primer TVB#1A, CCAGACCTGGTCGGCCCCTGCAGCCACAG) (SEQ ID NO. 13) (100 mM final concentration) was incubated with 2 μg of mRNA from PFSK-1 cells (ATCC CRL-2060, primitive neuroectodermal tumor) for 10 minutes at 85° C. and then chilled on ice for 1 minute. To this mixture were added, to final concentrations, 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 2.5 mM MgCl₂, 10 mM DTT, 400 μM each dATP, dCTP, dGTP, dTTP and 200 Units of Superscript II RT (GIBCO-BRL), and the reaction was incubated for 50 minutes at 42° C. The reaction was terminated by placing it at 70° C. for 15 minutes, after which it was incubated with 2 Units of RNase for an additional 30 minutes. The generated cDNA was purified using GlassMax DNA spin cartridges following the manufacturer's instructions (GIBCO-BRL). The isolated cDNA was tailed with terminal deoxynucleotidyl transferase (TdT), which added homopolymeric dCTP tails to the 3′ ends of the cDNA, in a reaction that was incubated for 10 minutes at 37° C. with a final composition of 10 mM Tris-HCl (pH 8.4), 25 mM KCl, 1.5 mM MgCl₂, 200 μM dCTP and 1 Unit of TdT. The TdT was heat inactivated for 10 minutes at 65° C. The tailed cDNA (5 μl) was amplified by PCR using two primers (primer TVB#1B, GGAGGCAGCCCCGGGAGCTGGGAG (SEQ ID NO. 14), and an Abridged Anchor primer from GIBCO-BRL, sequence not provided) with the final composition of the reaction being 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 1.5 mM MgCl₂, 400 mM primer TVB#1B, 400 mM Abridged Anchor primer, 200 μM each dATP, dCTP, dGTP, dTTP and 2.5 Units of Taq DNA polymerase. This reaction was transferred to a thermal cycler preequilibrated to 94° C. Thirty five cycles of PCR were performed with the following cycling protocol: predenaturation at 94° C. for 2 minutes, denaturation at 94° C. for 1 minute, annealing of primers at 58° C. for 1.5 minutes, primer extension at 72° C. for 2.5 minutes and final extension at 72° C. for 10 minutes. The 5′ RACE products were analyzed using standard agarose gel electrophoresis protocols. No visible bands were observed; therefore, the region above the 1.6 kb marker was isolated using a DNA gel extraction kit from Stratagene and subcloned into the T/A Bluescript vector using standard procedures. Several cDNA fragments were subcloned into the Bluescript vector and were sequenced. Only one clone, containing a 1.7 kb cDNA fragment, was similar to hGlcNAc-TV-b. The actual size of this cDNA fragment is 1676 base pairs, which did not encompass the entire hGlcNAc-TV-b clone; therefore, a second round of 5′ RACE was performed using the same protocol as above with different primers. To isolate the 5′ end of hGlcNAc-TV-b, another primer, TVB#2A (GGTCAAGATAAATGCGTTTTTCCACCGATC) (SEQ ID NO. 15), was used in place of primer TVB#1A, and TVB#2B (GTGGATTATATCCTATGGCAGAAAAGCTTTATAT) (SEQ ID NO. 16) was used in place of TVB#1B. This set of primers generated three cDNA fragments (3, 1.7 and 1.4 kb), which were isolated following the manufacturer's instructions using a DNA gel extraction kit from Stratagene and subcloned into the T/A Bluescript vector using standard procedures. Each of the cDNA fragments was sequenced, which revealed that only the 1.4 kb fragment was similar to hGlcNAc-TV and represents the 5′ end of hGlcNAc-TV-b. The actual size of this fragment is 1440 base pairs.
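As an illustrative sketch only, the position and orientation of the second-round primers relative to a cDNA sequence can be checked by searching both strands. The template below is nucleotides 421-600 of SEQ ID NO. 1, copied from the sequence listing; the helper names `reverse_complement` and `locate` are hypothetical and not part of the described protocol.

```python
# Complement map for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate(primer: str, template: str):
    """Return (strand, 0-based index) of the primer in the template, or None.

    'sense' means the primer text occurs in the template as written;
    'antisense' means its reverse complement occurs in the template.
    """
    primer = primer.upper()
    template = template.upper().replace(" ", "")
    idx = template.find(primer)
    if idx != -1:
        return ("sense", idx)
    idx = template.find(reverse_complement(primer))
    if idx != -1:
        return ("antisense", idx)
    return None


# Nucleotides 421-600 of SEQ ID NO. 1 (from the sequence listing).
EXCERPT = (
    "atagcagttg aaaatcacct tgtgctgctc catccactgt ggattatatc ctatggcaga "
    "aaagctttat attgctggct taggacagag gcaatacttt acaataaaag cactaacgga "
    "ggtcaagata aatgcgtttt tccaccgatc gacggttacc cacactacga gggaaaaatt"
)

PRIMERS = {
    "TVB#2A": "GGTCAAGATAAATGCGTTTTTCCACCGATC",
    "TVB#2B": "GTGGATTATATCCTATGGCAGAAAAGCTTTATAT",
}

for name, seq in PRIMERS.items():
    print(name, locate(seq, EXCERPT))
```

Searching both orientations is a deliberate choice here, since a primer sequence may be recorded either as written on the coding strand or as its reverse complement.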

[0171] The entire cDNA sequence of hGlcNAc-TV-b is 4541 base pairs and was reconstructed in three steps. First, a 1431 base pair band (designated band A) was isolated (Stratagene gel extraction kit) from the 1440 base pair 5′ end of hGTNV (from the second round of 5′ RACE) by restriction enzyme digestion with HindIII. Second, the middle section of hGlcNAc-TV-b (1623 base pairs, designated band B) was isolated from the 1676 base pair hGlcNAc-TVb fragment (from the first round of 5′ RACE) by restriction enzyme digestions with HindIII and SmaI and then ligated (using standard protocols) to band A. Finally, the 3′ end of hGTNVb was isolated by using the SmaI restriction enzyme to isolate a 1487 base pair band (designated band C). Band C was then ligated to band A+B to generate the entire nontranslated and translated sequence of hGlcNAc-TV-b.
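The fragment sizes given above can be cross-checked with simple arithmetic. The sketch below assumes, as a simplification, that ligation at the restriction sites neither duplicates nor removes bases, so that the reconstructed length is the plain sum of the three bands; the variable names are illustrative.

```python
# Band sizes in base pairs, as stated for the reconstruction.
bands = {"A": 1431, "B": 1623, "C": 1487}

# Stated length of the full hGlcNAc-TV-b cDNA.
stated_total = 4541

# Under the simplifying assumption above, the bands should add up
# to the stated full-length cDNA.
reconstructed = sum(bands.values())
print(reconstructed)  # 4541
print(reconstructed == stated_total)  # True
```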

EXAMPLE 2 Expression of GlcNAcTV-b

[0172] Northern Blot Analysis of Human Tissues

[0173] Human multiple tissue and tumor cell line Northern blots were obtained from Clontech. The Northern blot containing mRNA from human breast and uterus cancer tissues as well as normal tissues was obtained from Invitrogen. All Northern blots contained 2 μg of mRNA/lane. These blots were hybridized with [α-³²P]dCTP-labeled hGlcNAc-TV (nucleotides 1508-1921) and GlcNAc-TV-b (nucleotides 1959-2417) cDNAs. An Amersham multiprime DNA labeling kit and [α-³²P]dCTP (3000 Ci/mol) were used for labeling. Northern blots were hybridized under stringent conditions following the recommended protocol (Clontech) and exposed to x-ray film or a phosphorimager.
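For orientation, the probe lengths implied by the nucleotide ranges above can be computed as follows. This is a sketch that assumes the ranges are 1-based and inclusive at both ends, which the text does not state explicitly; `span_length` is a hypothetical helper.

```python
def span_length(first: int, last: int) -> int:
    """Length of a nucleotide range, assuming 1-based inclusive coordinates."""
    return last - first + 1


# Probe coordinates as given in the text.
probes = {
    "hGlcNAc-TV": (1508, 1921),
    "GlcNAc-TV-b": (1959, 2417),
}

for name, (first, last) in probes.items():
    print(name, span_length(first, last), "nt")
```

Under that assumption, the hGlcNAc-TV probe spans 414 nucleotides and the GlcNAc-TV-b probe spans 459 nucleotides.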

[0174] Results

[0175] The expression pattern of the two GlcNAc-TVs was examined in different human tissues. Hybridization of the GlcNAc-TV cDNA probe to Northern blots under stringent conditions revealed the wide expression of two transcripts ranging in size from 7.4-9.5 kb (FIG. 1). The major transcript (9.3 kb) was expressed in most tissues as well as in different parts of human brain (FIG. 2). The 9.3 kb and 7.4 kb transcripts were not detected in human tumor cell lines with the exception of the human colorectal cell line SW480 (FIG. 3), although in this case the 7.4 kb transcript was the predominant one. When the same blots were tested with the GlcNAc-TV-b cDNA probe, a very different pattern of tissue specific expression was observed. High levels of a 4.5 kb transcript were expressed in brain tissue and low levels in testis (FIG. 1). This transcript was not detected in the other tested tissues. The GlcNAc-TV-b transcript was expressed throughout the adult brain with the exception of spinal cord (FIG. 2). Four cell lines derived from solid tumors revealed expression of GlcNAc-TVb, whereas the 4.5 kb transcript was not detected in leukemia and lymphoma (FIG. 3). High expression of GlcNAc-TVb was detected in two different human tumor tissues (breast and uterus), whereas normal tissue adjacent to the tumor tissues showed very low levels of GlcNAc-TVb transcript.

[0176] Having illustrated and described the principles of the invention in a preferred embodiment, it should be appreciated by those skilled in the art that the invention can be modified in arrangement and detail without departure from such principles. All modifications coming within the scope of the following claims are claimed.

[0177] All publications, patents and patent applications referred to herein are incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosures by virtue of prior invention.

1 16 1 2061 DNA Homo sapiens 1 atgtttttta caatctcaag aaaaaatatgtcccagaaat tgagtttact gttgcttgta 60 tttggactca tttggggatt gatgttactgcactatactt ttcaacaacc aagacatcaa 120 agcagtgtca agttacgtga gcaaatactagacttaagca aaagatatgt taaagctcta 180 gcagaggaaa ataagaacac agtggatgtcgagaacggtg cttctatggc aggatatgcg 240 gatctgaaaa gaacaattgc tgtccttctggatgacattt tgcaacgatt ggtgaagctg 300 gagaacaaag ttgactatat tgttgtgaatggctcagcag ccaacaccac caatggtact 360 agtgggaatt tggtgccagt aaccacaaataaaagaacga atgtctcggg cagtatcagg 420 atagcagttg aaaatcacct tgtgctgctccatccactgt ggattatatc ctatggcaga 480 aaagctttat attgctggct taggacagaggcaatacttt acaataaaag cactaacgga 540 ggtcaagata aatgcgtttt tccaccgatcgacggttacc cacactacga gggaaaaatt 600 aagtggataa atgacatgtg ccgttcggatccgtgcaagg ctcattatgg tatagatggg 660 tccagttgca ctttttttat atacctcagtgacgccgaca atcattgtcc ccatgcaccc 720 tggagacata aaaatcctta cgacgacgctgagcataatt catgcgctga aattcgtagt 780 gattttgaac ttctgtacag tgtgattcatcataaggacg agttccattt tatgagacta 840 cggagacggc gaatggttga gggatgggcccaaatcgcaa agtccctagc agataagcag 900 aacgcagaga agaaaaaacg gaaaaaggccctagttcacc tgggaatcat taccaaggac 960 actgtatcta agattgctga aacaggtttcagtgccgcac ctcttggtga cttagttcat 1020 tggagtgatg taattacatc tgcgtacgcagcggggcatg acgttaggat cactgcatca 1080 ctggctgagc tcaaggatgt cgtgaagaagattataggta accgatctgg ttgcccatct 1140 gtaggagaca gaattgttga gctactttacgctgatgtaa ttggactcgg tcaattcaag 1200 aaaactctag gtccaacctg ggctcaacatcggtggatgg ttcgagtcct tgaaactttt 1260 ggatcagatc ccgattttga acatgccaattatgcgcaaa caaagggtca caagagccct 1320 tggggatggt ggaatctgaa ccctaataacttttatacaa tgttccccca tactccagaa 1380 aacacttttc ttgggtttgc gatcgagcagcacctaaact ccagtgatat gcaccacctt 1440 aatgagatga agaggcagaa tcagacgcttgtgtatggca aagtggatag cttctggaag 1500 aataagcata tttacttcga aatcattcacaattacatcg aagtgcaagc aactgtgtat 1560 gactcctcta cacccaatat tccctcttactctcgaaacc acggtattct ttctggtcgg 1620 gaccatcgat tcctcctccg agagaccttcttgttactag gactagggac tccttacgaa 1680 cgttgcgctc cgctggaagc 
catggcaaatcgatgcgtct ttctcaaacc gaagttcccc 1740 ccacccaatt caaggaagaa tacagagtttttacgaggca agcccacctc cagagaggtg 1800 ttctcccagc atccctacgc ggagaacttcatcggcaagc cccacgtgtg gacagtcgac 1860 tacaacaact cagaggagtt tgaagcagccatcaaggcca ttatgagaac tcaggtagac 1920 ccctacctac cctacgagta cacctgcgaggggatgctgg agcggatcac cgcctacatc 1980 cagcaccagg acttctgcag agcttcagaacactgccacc cacccagttt tataatccgc 2040 tccctctcca gggcaacccc a 2061 2 687PRT Homo sapiens 2 Met Phe Phe Thr Ile Ser Arg Lys Asn Met Ser Gln LysLeu Ser Leu 1 5 10 15 Leu Leu Leu Val Phe Gly Leu Ile Trp Gly Leu MetLeu Leu His Tyr 20 25 30 Thr Phe Gln Gln Pro Arg His Gln Ser Ser Val LysLeu Arg Glu Gln 35 40 45 Ile Leu Asp Leu Ser Lys Arg Tyr Val Lys Ala LeuAla Glu Glu Asn 50 55 60 Lys Asn Thr Val Asp Val Glu Asn Gly Ala Ser MetAla Gly Tyr Ala 65 70 75 80 Asp Leu Lys Arg Thr Ile Ala Val Leu Leu AspAsp Ile Leu Gln Arg 85 90 95 Leu Val Lys Leu Glu Asn Lys Val Asp Tyr IleVal Val Asn Gly Ser 100 105 110 Ala Ala Asn Thr Thr Asn Gly Thr Ser GlyAsn Leu Val Pro Val Thr 115 120 125 Thr Asn Lys Arg Thr Asn Val Ser GlySer Ile Arg Ile Ala Val Glu 130 135 140 Asn His Leu Val Leu Leu His ProLeu Trp Ile Ile Ser Tyr Gly Arg 145 150 155 160 Lys Ala Leu Tyr Cys TrpLeu Arg Thr Glu Ala Ile Leu Tyr Asn Lys 165 170 175 Ser Thr Asn Gly GlyGln Asp Lys Cys Val Phe Pro Pro Ile Asp Gly 180 185 190 Tyr Pro His TyrGlu Gly Lys Ile Lys Trp Ile Asn Asp Met Cys Arg 195 200 205 Ser Asp ProCys Lys Ala His Tyr Gly Ile Asp Gly Ser Ser Cys Thr 210 215 220 Phe PheIle Tyr Leu Ser Asp Ala Asp Asn His Cys Pro His Ala Pro 225 230 235 240Trp Arg His Lys Asn Pro Tyr Asp Asp Ala Glu His Asn Ser Cys Ala 245 250255 Glu Ile Arg Ser Asp Phe Glu Leu Leu Tyr Ser Val Ile His His Lys 260265 270 Asp Glu Phe His Phe Met Arg Leu Arg Arg Arg Arg Met Val Glu Gly275 280 285 Trp Ala Gln Ile Ala Lys Ser Leu Ala Asp Lys Gln Asn Ala GluLys 290 295 300 Lys Lys Arg Lys Lys Ala Leu Val His Leu Gly Ile Ile ThrLys Asp 305 310 315 320 Thr Val Ser Lys Ile Ala Glu Thr Gly Phe Ser AlaAla 
Pro Leu Gly 325 330 335 Asp Leu Val His Trp Ser Asp Val Ile Thr SerAla Tyr Ala Ala Gly 340 345 350 His Asp Val Arg Ile Thr Ala Ser Leu AlaGlu Leu Lys Asp Val Val 355 360 365 Lys Lys Ile Ile Gly Asn Arg Ser GlyCys Pro Ser Val Gly Asp Arg 370 375 380 Ile Val Glu Leu Leu Tyr Ala AspVal Ile Gly Leu Gly Gln Phe Lys 385 390 395 400 Lys Thr Leu Gly Pro ThrTrp Ala Gln His Arg Trp Met Val Arg Val 405 410 415 Leu Glu Thr Phe GlySer Asp Pro Asp Phe Glu His Ala Asn Tyr Ala 420 425 430 Gln Thr Lys GlyHis Lys Ser Pro Trp Gly Trp Trp Asn Leu Asn Pro 435 440 445 Asn Asn PheTyr Thr Met Phe Pro His Thr Pro Glu Asn Thr Phe Leu 450 455 460 Gly PheAla Ile Glu Gln His Leu Asn Ser Ser Asp Met His His Leu 465 470 475 480Asn Glu Met Lys Arg Gln Asn Gln Thr Leu Val Tyr Gly Lys Val Asp 485 490495 Ser Phe Trp Lys Asn Lys His Ile Tyr Phe Glu Ile Ile His Asn Tyr 500505 510 Ile Glu Val Gln Ala Thr Val Tyr Asp Ser Ser Thr Pro Asn Ile Pro515 520 525 Ser Tyr Ser Arg Asn His Gly Ile Leu Ser Gly Arg Asp His ArgPhe 530 535 540 Leu Leu Arg Glu Thr Phe Leu Leu Leu Gly Leu Gly Thr ProTyr Glu 545 550 555 560 Arg Cys Ala Pro Leu Glu Ala Met Ala Asn Arg CysVal Phe Leu Lys 565 570 575 Pro Lys Phe Pro Pro Pro Asn Ser Arg Lys AsnThr Glu Phe Leu Arg 580 585 590 Gly Lys Pro Thr Ser Arg Glu Val Phe SerGln His Pro Tyr Ala Glu 595 600 605 Asn Phe Ile Gly Lys Pro His Val TrpThr Val Asp Tyr Asn Asn Ser 610 615 620 Glu Glu Phe Glu Ala Ala Ile LysAla Ile Met Arg Thr Gln Val Asp 625 630 635 640 Pro Tyr Leu Pro Tyr GluTyr Thr Cys Glu Gly Met Leu Glu Arg Ile 645 650 655 Thr Ala Tyr Ile GlnHis Gln Asp Phe Cys Arg Ala Ser Glu His Cys 660 665 670 His Pro Pro SerPhe Ile Ile Arg Ser Leu Ser Arg Ala Thr Pro 675 680 685 3 4541 DNA Homosapiens 3 ggctcttacc gcagcctgag tttcagcagc tgctgcgcaa ggccaaactcttcctcgggt 60 ttggcttccc ctacgagggc cccgcccccc tggaggccat cgccaatggttgcatcttcc 120 tgcagtcccg cttcagcccg ccccacagct ccctcaacca cgagttcttcccaggcaagc 180 ccacctccag agaggtgttc tcccagcatc cctacgcgga gaacttcatcggcaagcccc 240 acgtgtggac 
agtcgactac aacaactcag aggagtttga agcagccatcaaggccatta 300 tgagaactca ggtagacccc tacctaccct acgagtacac ctgcgaggggatgctggagc 360 ggatccacgc ctacatccag caccaggact tctgcagagc tccagaccactgccctacca 420 gaggcccacg ccccgcagag cccctttgtc ctggccccca atgccacccacctcgagtgg 480 gctcggaaca ccagcttggc tcctggggcc tggcccccgc gcacaccctgcgggcctggc 540 tggccgtgcc tgggagggcc tgcaccgaca cctgcctgga ccacgggctaatctgtgagc 600 cctccttctt ccccttcctg aacagccagg acgccttcct caagctgcaggtgccctgtg 660 acagcaccga gtcggagatg aaccacctgt actctcggcg ttcgcccagcctggccagga 720 gtgctacctg cagaaggagc ctctgctctt cagtgcgccg gctccaacaccaagtaccgc 780 cggctctgcc cctgccgcga cttccgcaag cggaattccg gccggaattccggaattctt 840 ttgcttttta cgagtcgagt tttttttctt ttttttttca agtcttgatttgtggcttac 900 ctcaagttac catttttcag tcaagtctgt ttgtttgctt cttcagaaatgttttttaca 960 atctcaagaa aaaatatgtc ccagaaattg agtttactgt tgcttgtatttggactcatt 1020 tggggattga tgttactgca ctatactttt caacaaccaa gacatcaaagcagtgtcaag 1080 ttacgtgagc aaatactaga cttaagcaaa agatatgtta aagctctagcagaggaaaat 1140 aagaacacag tggatgtcga gaacggtgct tctatggcag gatatgcggatctgaaaaga 1200 acaattgctg tccttctgga tgacattttg caacgattgg tgaagctggagaacaaagtt 1260 gactatattg ttgtgaatgg ctcagcagcc aacaccacca atggtactagtgggaatttg 1320 gtgccagtaa ccacaaataa aagaacgaat gtctcgggca gtatcaggatagcagttgaa 1380 aatcaccttg tgctgctcca tccactgtgg attatatcct atggcagaaaagctttatat 1440 tgctggctta ggacagaggc aatactttac aataaaagca ctaacggaggtcaagataaa 1500 tgcgtttttc caccgatcga cggttaccca cactacgagg gaaaaattaagtggataaat 1560 gacatgtgcc gttcggatcc gtgcaaggct cattatggta tagatgggtccagttgcact 1620 ttttttatat acctcagtga cgccgacaat cattgtcccc atgcaccctggagacataaa 1680 aatccttacg acgacgctga gcataattca tgcgctgaaa ttcgtagtgattttgaactt 1740 ctgtacagtg tgattcatca taaggacgag ttccatttta tgagactacggagacggcga 1800 atggttgagg gatgggccca aatcgcaaag tccctagcag ataagcagaacgcagagaag 1860 aaaaaacgga aaaaggccct agttcacctg ggaatcatta ccaaggacactgtatctaag 1920 attgctgaaa caggtttcag tgccgcacct cttggtgact 
tagttcattggagtgatgta 1980 attacatctg cgtacgcagc ggggcatgac gttaggatca ctgcatcactggctgagctc 2040 aaggatgtcg tgaagaagat tataggtaac cgatctggtt gcccatctgtaggagacaga 2100 attgttgagc tactttacgc tgatgtaatt ggactcggtc aattcaagaaaactctaggt 2160 ccaacctggg ctcaacatcg gtggatggtt cgagtccttg aaacttttggatcagatccc 2220 gattttgaac atgccaatta tgcgcaaaca aagggtcaca agagcccttggggatggtgg 2280 aatctgaacc ctaataactt ttatacaatg ttcccccata ctccagaaaacacttttctt 2340 gggtttgcga tcgagcagca cctaaactcc agtgatatgc accaccttaatgagatgaag 2400 aggcagaatc agacgcttgt gtatggcaaa gtggatagct tctggaagaataagcatatt 2460 tacttcgaaa tcattcacaa ttacatcgaa gtgcaagcaa ctgtgtatgactcctctaca 2520 cccaatattc cctcttactc tcgaaaccac ggtattcttt ctggtcgggaccatcgattc 2580 ctcctccgag agaccttctt gttactagga ctagggactc cttacgaacgttgcgctccg 2640 ctggaagcca tggcaaatcg atgcgtcttt ctcaaaccga agttccccccacccaattca 2700 aggaagaata cagagttttt acgaggcaag cccacctcca gagaggtgttctcccagcat 2760 ccctacgcgg agaacttcat cggcaagccc cacgtgtgga cagtcgactacaacaactca 2820 gaggagtttg aagcagccat caaggccatt atgagaactc aggtagacccctacctaccc 2880 tacgagtaca cctgcgaggg gatgctggag cggatcaccg cctacatccagcaccaggac 2940 ttctgcagag cttcagaaca ctgccaccca cccagtttta taatccgctccctctccagg 3000 gcaaccccac ccaccagcct aggcctgctc ctccaccttc cgggaggcagccccgggagc 3060 tgggagctgg tggaggggcc aggctggacg cttcccgtgg gagtcccctccagacctggt 3120 cggcccctgc agccacagaa ccacgatggc aaaaaatcta tttgttctcaaggactaacc 3180 tttgggggga aagcaataga gacactcttt ttctctcttt ttttaaagatttatttcttt 3240 aaataataaa tattttattt ggatgtgagg tgcagaagag aaaaaaaaaaaaaaaaaaaa 3300 aagcgcggcc gcaagcttat tccctttagt gagggttaat ttaaaaagcaaaagaattcc 3360 ggcctgagct cagctaggac agtgactatt taatatagtt aatgccaggaactttcaccc 3420 cacgtatgga agagttcaat cttagagtag acaccttgtg aatacacaaaccaacactcc 3480 cttctgaatt ctcattccta gcacattgtc cttacagatt cccaggggacaccaagaggt 3540 ttttgcctat ataaaattaa ctagcaacag taaatggtga agtcctaattaaataagcat 3600 gggttaaaag ccagtcgtct gctaagatgg tgaagggtgt ccccatccccatgtttaata 3660 aatgattgct 
gaatccacaa ttcctctaaa gttgatggga aagtttccatctttcagata 3720 agagcatatt atcaacggtt aaaggatatc ccaggccctc cagcaaatgccttctggaat 3780 catctccaca ttcagacaca tcgtaaacaa cagaggggca atactcatgcttcgcaaaag 3840 ccgttcattc cccttggcaa aggcgggaga gagggctcac caaccttggagaagcctggt 3900 ttacatcgtc aaggtagcta ctgccctcta gtgttgatat gggaataaagcaaaaaagta 3960 tacctggttg aaacgaaacc gaactccaca aagtttttca attactgatgtgtctcagca 4020 gccttggtag gagcttggaa aacatcatca ggtgaggata ttgcactggagctgacctct 4080 tgtggcttct aaagtttctt tttttttttt tttttttttt tgagacagagtctcactgtg 4140 tcacccaggc tggagtgcat tttcttgtgt ccaaccaaga ctcacataccatctcagctc 4200 actgcaacct ccacctccca ggttcaagag atgctcctgc cctagcctcccaagtagctg 4260 ggatcacagg catgtgccac cacacccagc taagttttgt atttttagaagagatggggt 4320 ttcacgatgt tggccagact ggtctcgaac tcctgaccta aagtgatccacctgccttgg 4380 cttcccaaaa tgctggatta caggtgtgaa ccactgcacc tggcctccaagatttctatt 4440 tggcaaattc acatagctac tttcatactt gttaaaatac cgaaatgcttccataccagt 4500 tagcaaaagg ccacccggaa ttcagcttgg acttaaccag g 4541 41485 PRT Homo sapiens 4 Gly Ser Tyr Arg Ser Leu Ser Phe Ser Ser Cys CysAla Arg Pro Asn 1 5 10 15 Ser Ser Ser Gly Leu Ala Ser Pro Thr Arg AlaPro Pro Pro Trp Arg 20 25 30 Pro Ser Pro Met Val Ala Ser Ser Cys Ser ProAla Ser Ala Arg Pro 35 40 45 Thr Ala Pro Ser Thr Thr Ser Ser Ser Gln AlaSer Pro Pro Pro Glu 50 55 60 Arg Cys Ser Pro Ser Ile Pro Thr Arg Arg ThrSer Ser Ala Ser Pro 65 70 75 80 Thr Cys Gly Gln Ser Thr Thr Thr Thr GlnArg Ser Leu Lys Gln Pro 85 90 95 Ser Arg Pro Leu Glu Leu Arg Thr Pro ThrTyr Pro Thr Ser Thr Pro 100 105 110 Ala Arg Gly Cys Trp Ser Gly Ser ThrPro Thr Ser Ser Thr Arg Thr 115 120 125 Ser Ala Glu Leu Gln Thr Thr AlaLeu Pro Glu Ala His Ala Pro Gln 130 135 140 Ser Pro Phe Val Leu Ala ProAsn Ala Thr His Leu Glu Trp Ala Arg 145 150 155 160 Asn Thr Ser Leu AlaPro Gly Ala Trp Pro Pro Arg Thr Pro Cys Gly 165 170 175 Pro Gly Trp ProCys Leu Gly Gly Pro Ala Pro Thr Pro Ala Trp Thr 180 185 190 Thr Gly SerVal Ser Pro Pro Ser Ser Pro Ser Thr Ala Arg Thr Pro 
195 200 205 Ser SerSer Cys Arg Cys Pro Val Thr Ala Pro Ser Arg Arg Thr Thr 210 215 220 CysThr Leu Gly Val Arg Pro Ala Trp Pro Gly Val Leu Pro Ala Glu 225 230 235240 Gly Ala Ser Ala Leu Gln Cys Ala Gly Ser Asn Thr Lys Tyr Arg Arg 245250 255 Leu Cys Pro Cys Arg Asp Phe Arg Lys Arg Asn Ser Gly Arg Asn Ser260 265 270 Gly Ile Leu Leu Leu Phe Thr Ser Arg Val Phe Phe Leu Phe PhePhe 275 280 285 Lys Ser Phe Val Ala Tyr Leu Lys Leu Pro Phe Phe Ser GlnVal Cys 290 295 300 Leu Phe Ala Ser Ser Glu Met Phe Phe Thr Ile Ser ArgLys Asn Met 305 310 315 320 Ser Gln Lys Leu Ser Leu Leu Leu Leu Val PheGly Leu Ile Trp Gly 325 330 335 Leu Met Leu Leu His Tyr Thr Phe Gln GlnPro Arg His Gln Ser Ser 340 345 350 Val Lys Leu Arg Glu Gln Ile Leu AspLeu Ser Lys Arg Tyr Val Lys 355 360 365 Ala Leu Ala Glu Glu Asn Lys AsnThr Val Asp Val Glu Asn Gly Ala 370 375 380 Ser Met Ala Gly Tyr Ala AspLeu Lys Arg Thr Ile Ala Val Leu Leu 385 390 395 400 Asp Asp Ile Leu GlnArg Leu Val Lys Leu Glu Asn Lys Val Asp Tyr 405 410 415 Ile Val Val AsnGly Ser Ala Ala Asn Thr Thr Asn Gly Thr Ser Gly 420 425 430 Asn Leu ValPro Val Thr Thr Asn Lys Arg Thr Asn Val Ser Gly Ser 435 440 445 Ile ArgIle Ala Val Glu Asn His Leu Val Leu Leu His Pro Leu Trp 450 455 460 IleIle Ser Tyr Gly Arg Lys Ala Leu Tyr Cys Trp Leu Arg Thr Glu 465 470 475480 Ala Ile Leu Tyr Asn Lys Ser Thr Asn Gly Gly Gln Asp Lys Cys Val 485490 495 Phe Pro Pro Ile Asp Gly Tyr Pro His Tyr Glu Gly Lys Ile Lys Trp500 505 510 Ile Asn Asp Met Cys Arg Ser Asp Pro Cys Lys Ala His Tyr GlyIle 515 520 525 Asp Gly Ser Ser Cys Thr Phe Phe Ile Tyr Leu Ser Asp AlaAsp Asn 530 535 540 His Cys Pro His Ala Pro Trp Arg His Lys Asn Pro TyrAsp Asp Ala 545 550 555 560 Glu His Asn Ser Cys Ala Glu Ile Arg Ser AspPhe Glu Leu Leu Tyr 565 570 575 Ser Val Ile His His Lys Asp Glu Phe HisPhe Met Arg Leu Arg Arg 580 585 590 Arg Arg Met Val Glu Gly Trp Ala GlnIle Ala Lys Ser Leu Ala Asp 595 600 605 Lys Gln Asn Ala Glu Lys Lys LysArg Lys Lys Ala Leu Val His Leu 610 615 620 Gly Ile Ile Thr Lys 
Asp ThrVal Ser Lys Ile Ala Glu Thr Gly Phe 625 630 635 640 Ser Ala Ala Pro LeuGly Asp Leu Val His Trp Ser Asp Val Ile Thr 645 650 655 Ser Ala Tyr AlaAla Gly His Asp Val Arg Ile Thr Ala Ser Leu Ala 660 665 670 Glu Leu LysAsp Val Val Lys Lys Ile Ile Gly Asn Arg Ser Gly Cys 675 680 685 Pro SerVal Gly Asp Arg Ile Val Glu Leu Leu Tyr Ala Asp Val Ile 690 695 700 GlyLeu Gly Gln Phe Lys Lys Thr Leu Gly Pro Thr Trp Ala Gln His 705 710 715720 Arg Trp Met Val Arg Val Leu Glu Thr Phe Gly Ser Asp Pro Asp Phe 725730 735 Glu His Ala Asn Tyr Ala Gln Thr Lys Gly His Lys Ser Pro Trp Gly740 745 750 Trp Trp Asn Leu Asn Pro Asn Asn Phe Tyr Thr Met Phe Pro HisThr 755 760 765 Pro Glu Asn Thr Phe Leu Gly Phe Ala Ile Glu Gln His LeuAsn Ser 770 775 780 Ser Asp Met His His Leu Asn Glu Met Lys Arg Gln AsnGln Thr Leu 785 790 795 800 Val Tyr Gly Lys Val Asp Ser Phe Trp Lys AsnLys His Ile Tyr Phe 805 810 815 Glu Ile Ile His Asn Tyr Ile Glu Val GlnAla Thr Val Tyr Asp Ser 820 825 830 Ser Thr Pro Asn Ile Pro Ser Tyr SerArg Asn His Gly Ile Leu Ser 835 840 845 Gly Arg Asp His Arg Phe Leu LeuArg Glu Thr Phe Leu Leu Leu Gly 850 855 860 Leu Gly Thr Pro Tyr Glu ArgCys Ala Pro Leu Glu Ala Met Ala Asn 865 870 875 880 Arg Cys Val Phe LeuLys Pro Lys Phe Pro Pro Pro Asn Ser Arg Lys 885 890 895 Asn Thr Glu PheLeu Arg Gly Lys Pro Thr Ser Arg Glu Val Phe Ser 900 905 910 Gln His ProTyr Ala Glu Asn Phe Ile Gly Lys Pro His Val Trp Thr 915 920 925 Val AspTyr Asn Asn Ser Glu Glu Phe Glu Ala Ala Ile Lys Ala Ile 930 935 940 MetArg Thr Gln Val Asp Pro Tyr Leu Pro Tyr Glu Tyr Thr Cys Glu 945 950 955960 Gly Met Leu Glu Arg Ile Thr Ala Tyr Ile Gln His Gln Asp Phe Cys 965970 975 Arg Ala Ser Glu His Cys His Pro Pro Ser Phe Ile Ile Arg Ser Leu980 985 990 Ser Arg Ala Thr Pro Pro Thr Ser Leu Gly Leu Leu Leu His LeuPro 995 1000 1005 Gly Gly Ser Pro Gly Ser Trp Glu Leu Val Glu Gly ProGly Trp Thr 1010 1015 1020 Leu Pro Val Gly Val Pro Ser Arg Pro Gly ArgPro Leu Gln Pro Gln 1025 1030 1035 1040 Asn His Asp Gly Lys Lys Ser IleCys Ser 
Gln Gly Leu Thr Phe Gly 1045 1050 1055 Gly Lys Ala Ile Glu ThrLeu Phe Phe Ser Leu Phe Leu Lys Ile Tyr 1060 1065 1070 Phe Phe Lys IlePhe Tyr Leu Asp Val Arg Cys Arg Arg Glu Lys Lys 1075 1080 1085 Lys LysLys Lys Lys Arg Gly Arg Lys Leu Ile Pro Phe Ser Glu Gly 1090 1095 1100Phe Lys Lys Gln Lys Asn Ser Gly Leu Ser Ser Ala Arg Thr Val Thr 11051110 1115 1120 Ile Tyr Ser Cys Gln Glu Leu Ser Pro His Val Trp Lys SerSer Ile 1125 1130 1135 Leu Glu Thr Pro Cys Glu Tyr Thr Asn Gln His SerLeu Leu Asn Ser 1140 1145 1150 His Ser His Ile Val Leu Thr Asp Ser GlnGly Thr Pro Arg Gly Phe 1155 1160 1165 Cys Leu Tyr Lys Ile Asn Gln GlnMet Val Lys Ser Leu Asn Lys His 1170 1175 1180 Gly Leu Lys Ala Ser ArgLeu Leu Arg Trp Arg Val Ser Pro Ser Pro 1185 1190 1195 1200 Cys Leu IleAsn Asp Cys Ile His Asn Ser Ser Lys Val Asp Gly Lys 1205 1210 1215 ValSer Ile Phe Gln Ile Arg Ala Tyr Tyr Gln Arg Leu Lys Asp Ile 1220 12251230 Pro Gly Pro Pro Ala Asn Ala Phe Trp Asn His Leu His Ile Gln Thr1235 1240 1245 His Arg Lys Gln Gln Arg Gly Asn Thr His Ala Ser Gln LysPro Phe 1250 1255 1260 Ile Pro Leu Gly Lys Gly Gly Arg Glu Gly Ser ProThr Leu Glu Lys 1265 1270 1275 1280 Pro Gly Leu His Arg Gln Gly Ser TyrCys Pro Leu Val Leu Ile Trp 1285 1290 1295 Glu Ser Lys Lys Val Tyr LeuVal Glu Thr Lys Pro Asn Ser Thr Lys 1300 1305 1310 Phe Phe Asn Tyr CysVal Ser Ala Ala Leu Val Gly Ala Trp Lys Thr 1315 1320 1325 Ser Ser GlyGlu Asp Ile Ala Leu Glu Leu Thr Ser Cys Gly Phe Ser 1330 1335 1340 PhePhe Phe Phe Phe Phe Phe Phe Leu Arg Gln Ser Leu Thr Val Ser 1345 13501355 1360 Pro Arg Leu Glu Cys Ile Phe Leu Cys Pro Thr Lys Thr His IlePro 1365 1370 1375 Ser Gln Leu Thr Ala Thr Ser Thr Ser Gln Val Gln GluMet Leu Leu 1380 1385 1390 Pro Pro Pro Lys Leu Gly Ser Gln Ala Cys AlaThr Thr Pro Ser Val 1395 1400 1405 Leu Tyr Phe Lys Arg Trp Gly Phe ThrMet Leu Ala Arg Leu Val Ser 1410 1415 1420 Asn Ser Pro Lys Val Ile HisLeu Pro Trp Leu Pro Lys Met Leu Asp 1425 1430 1435 1440 Tyr Arg Cys GluPro Leu His Leu Ala Ser Lys Ile Ser Ile Trp Gln 
1445 1450 1455 Ile HisIle Ala Thr Phe Ile Leu Val Lys Ile Pro Lys Cys Phe His 1460 1465 1470Thr Ser Gln Lys Ala Thr Arg Asn Ser Ala Trp Thr Pro 1475 1480 1485 52298 DNA Homo sapiens 5 atgtttttta caatctcaag aaaaaatatg tcccagaaattgagtttact gttgcttgta 60 tttggactca tttggggatt gatgttactg cactatacttttcaacaacc aagacatcaa 120 agcagtgtca agttacgtga gcaaatacta gacttaagcaaaagatatgt taaagctcta 180 gcagaggaaa ataagaacac agtggatgtc gagaacggtgcttctatggc aggatatgcg 240 gatctgaaaa gaacaattgc tgtccttctg gatgacattttgcaacgatt ggtgaagctg 300 gagaacaaag ttgactatat tgttgtgaat ggctcagcagccaacaccac caatggtact 360 agtgggaatt tggtgccagt aaccacaaat aaaagaacgaatgtctcggg cagtatcagg 420 atagcagttg aaaatcacct tgtgctgctc catccactgtggattatatc ctatggcaga 480 aaagctttat attgctggct taggacagag gcaatactttacaataaaag cactaacgga 540 ggtcaagata aatgcgtttt tccaccgatc gacggttacccacactacga gggaaaaatt 600 aagtggataa atgacatgtg ccgttcggat ccgtgcaaggctcattatgg tatagatggg 660 tccagttgca ctttttttat atacctcagt gacgccgacaatcattgtcc ccatgcaccc 720 tggagacata aaaatcctta cgacgacgct gagcataattcatgcgctga aattcgtagt 780 gattttgaac ttctgtacag tgtgattcat cataaggacgagttccattt tatgagacta 840 cggagacggc gaatggttga gggatgggcc caaatcgcaaagtccctagc agataagcag 900 aacgcagaga agaaaaaacg gaaaaaggcc ctagttcacctgggaatcat taccaaggac 960 actgtatcta agattgctga aacaggtttc agtgccgcacctcttggtga cttagttcat 1020 tggagtgatg taattacatc tgcgtacgca gcggggcatgacgttaggat cactgcatca 1080 ctggctgagc tcaaggatgt cgtgaagaag attataggtaaccgatctgg ttgcccatct 1140 gtaggagaca gaattgttga gctactttac gctgatgtaattggactcgg tcaattcaag 1200 aaaactctag gtccaacctg ggctcaacat cggtggatggttcgagtcct tgaaactttt 1260 ggatcagatc ccgattttga acatgccaat tatgcgcaaacaaagggtca caagagccct 1320 tggggatggt ggaatctgaa ccctaataac ttttatacaatgttccccca tactccagaa 1380 aacacttttc ttgggtttgc gatcgagcag cacctaaactccagtgatat gcaccacctt 1440 aatgagatga agaggcagaa tcagacgctt gtgtatggcaaagtggatag cttctggaag 1500 aataagcata tttacttcga aatcattcac aattacatcgaagtgcaagc aactgtgtat 1560 gactcctcta 
cacccaatat tccctcttac tctcgaaaccacggtattct ttctggtcgg 1620 gaccatcgat tcctcctccg agagaccttc ttgttactaggactagggac tccttacgaa 1680 cgttgcgctc cgctggaagc catggcaaat cgatgcgtctttctcaaacc gaagttcccc 1740 ccacccaatt caaggaagaa tacagagttt ttacgaggcaagcccacctc cagagaggtg 1800 ttctcccagc atccctacgc ggagaacttc atcggcaagccccacgtgtg gacagtcgac 1860 tacaacaact cagaggagtt tgaagcagcc atcaaggccattatgagaac tcaggtagac 1920 ccctacctac cctacgagta cacctgcgag gggatgctggagcggatcac cgcctacatc 1980 cagcaccagg acttctgcag agcttcagaa cactgccacccacccagttt tataatccgc 2040 tccctctcca gggcaacccc acccaccagc ctaggcctgctcctccacct tccgggaggc 2100 agccccggga gctgggagct ggtggagggg ccaggctggacgcttcccgt gggagtcccc 2160 tccagacctg gtcggcccct gcagccacag aaccacgatggcaaaaaatc tatttgttct 2220 caaggactaa cctttggggg gaaagcaata gagacactctttttctctct ttttttaaag 2280 atttatttct ttaaataa 2298 6 765 PRT Homosapiens 6 Met Phe Phe Thr Ile Ser Arg Lys Asn Met Ser Gln Lys Leu SerLeu 1 5 10 15 Leu Leu Leu Val Phe Gly Leu Ile Trp Gly Leu Met Leu LeuHis Tyr 20 25 30 Thr Phe Gln Gln Pro Arg His Gln Ser Ser Val Lys Leu ArgGlu Gln 35 40 45 Ile Leu Asp Leu Ser Lys Arg Tyr Val Lys Ala Leu Ala GluGlu Asn 50 55 60 Lys Asn Thr Val Asp Val Glu Asn Gly Ala Ser Met Ala GlyTyr Ala 65 70 75 80 Asp Leu Lys Arg Thr Ile Ala Val Leu Leu Asp Asp IleLeu Gln Arg 85 90 95 Leu Val Lys Leu Glu Asn Lys Val Asp Tyr Ile Val ValAsn Gly Ser 100 105 110 Ala Ala Asn Thr Thr Asn Gly Thr Ser Gly Asn LeuVal Pro Val Thr 115 120 125 Thr Asn Lys Arg Thr Asn Val Ser Gly Ser IleArg Ile Ala Val Glu 130 135 140 Asn His Leu Val Leu Leu His Pro Leu TrpIle Ile Ser Tyr Gly Arg 145 150 155 160 Lys Ala Leu Tyr Cys Trp Leu ArgThr Glu Ala Ile Leu Tyr Asn Lys 165 170 175 Ser Thr Asn Gly Gly Gln AspLys Cys Val Phe Pro Pro Ile Asp Gly 180 185 190 Tyr Pro His Tyr Glu GlyLys Ile Lys Trp Ile Asn Asp Met Cys Arg 195 200 205 Ser Asp Pro Cys LysAla His Tyr Gly Ile Asp Gly Ser Ser Cys Thr 210 215 220 Phe Phe Ile TyrLeu Ser Asp Ala Asp Asn His Cys Pro His Ala Pro 225 230 235 240 
Trp ArgHis Lys Asn Pro Tyr Asp Asp Ala Glu His Asn Ser Cys Ala 245 250 255 GluIle Arg Ser Asp Phe Glu Leu Leu Tyr Ser Val Ile His His Lys 260 265 270Asp Glu Phe His Phe Met Arg Leu Arg Arg Arg Arg Met Val Glu Gly 275 280285 Trp Ala Gln Ile Ala Lys Ser Leu Ala Asp Lys Gln Asn Ala Glu Lys 290295 300 Lys Lys Arg Lys Lys Ala Leu Val His Leu Gly Ile Ile Thr Lys Asp305 310 315 320 Thr Val Ser Lys Ile Ala Glu Thr Gly Phe Ser Ala Ala ProLeu Gly 325 330 335 Asp Leu Val His Trp Ser Asp Val Ile Thr Ser Ala TyrAla Ala Gly 340 345 350 His Asp Val Arg Ile Thr Ala Ser Leu Ala Glu LeuLys Asp Val Val 355 360 365 Lys Lys Ile Ile Gly Asn Arg Ser Gly Cys ProSer Val Gly Asp Arg 370 375 380 Ile Val Glu Leu Leu Tyr Ala Asp Val IleGly Leu Gly Gln Phe Lys 385 390 395 400 Lys Thr Leu Gly Pro Thr Trp AlaGln His Arg Trp Met Val Arg Val 405 410 415 Leu Glu Thr Phe Gly Ser AspPro Asp Phe Glu His Ala Asn Tyr Ala 420 425 430 Gln Thr Lys Gly His LysSer Pro Trp Gly Trp Trp Asn Leu Asn Pro 435 440 445 Asn Asn Phe Tyr ThrMet Phe Pro His Thr Pro Glu Asn Thr Phe Leu 450 455 460 Gly Phe Ala IleGlu Gln His Leu Asn Ser Ser Asp Met His His Leu 465 470 475 480 Asn GluMet Lys Arg Gln Asn Gln Thr Leu Val Tyr Gly Lys Val Asp 485 490 495 SerPhe Trp Lys Asn Lys His Ile Tyr Phe Glu Ile Ile His Asn Tyr 500 505 510Ile Glu Val Gln Ala Thr Val Tyr Asp Ser Ser Thr Pro Asn Ile Pro 515 520525 Ser Tyr Ser Arg Asn His Gly Ile Leu Ser Gly Arg Asp His Arg Phe 530535 540 Leu Leu Arg Glu Thr Phe Leu Leu Leu Gly Leu Gly Thr Pro Tyr Glu545 550 555 560 Arg Cys Ala Pro Leu Glu Ala Met Ala Asn Arg Cys Val PheLeu Lys 565 570 575 Pro Lys Phe Pro Pro Pro Asn Ser Arg Lys Asn Thr GluPhe Leu Arg 580 585 590 Gly Lys Pro Thr Ser Arg Glu Val Phe Ser Gln HisPro Tyr Ala Glu 595 600 605 Asn Phe Ile Gly Lys Pro His Val Trp Thr ValAsp Tyr Asn Asn Ser 610 615 620 Glu Glu Phe Glu Ala Ala Ile Lys Ala IleMet Arg Thr Gln Val Asp 625 630 635 640 Pro Tyr Leu Pro Tyr Glu Tyr ThrCys Glu Gly Met Leu Glu Arg Ile 645 650 655 Thr Ala Tyr Ile Gln His GlnAsp 
Phe Cys Arg Ala Ser Glu His Cys 660 665 670 His Pro Pro Ser Phe IleIle Arg Ser Leu Ser Arg Ala Thr Pro Pro 675 680 685 Thr Ser Leu Gly LeuLeu Leu His Leu Pro Gly Gly Ser Pro Gly Ser 690 695 700 Trp Glu Leu ValGlu Gly Pro Gly Trp Thr Leu Pro Val Gly Val Pro 705 710 715 720 Ser ArgPro Gly Arg Pro Leu Gln Pro Gln Asn His Asp Gly Lys Lys 725 730 735 SerIle Cys Ser Gln Gly Leu Thr Phe Gly Gly Lys Ala Ile Glu Thr 740 745 750Leu Phe Phe Ser Leu Phe Leu Lys Ile Tyr Phe Phe Lys 755 760 765 7 948DNA Homo sapiens 7 cggctcttac cgcagcctga gtttcagcag ctgctgcgcaaggccaaact cttcctcggg 60 tttggcttcc cctacgaggg ccccgccccc ctggaggccatcgccaatgg ttgcatcttc 120 ctgcagtccc gcttcagccc gccccacagc tccctcaaccacgagttctt cccaggcaag 180 cccacctcca gagaggtgtt ctcccagcat ccctacgcggagaacttcat cggcaagccc 240 cacgtgtgga cagtcgacta caacaactca gaggagtttgaagcagccat caaggccatt 300 atgagaactc aggtagaccc ctacctaccc tacgagtacacctgcgaggg gatgctggag 360 cggatccacg cctacatcca gcaccaggac ttctgcagagctccagacca ctgccctacc 420 agaggcccac gccccgcaga gcccctttgt cctggcccccaatgccaccc acctcgagtg 480 ggctcggaac accagcttgg ctcctggggc ctggcccccgcgcacaccct gcgggcctgg 540 ctggccgtgc ctgggagggc ctgcaccgac acctgcctggaccacgggct aatctgtgag 600 ccctccttct tccccttcct gaacagccag gacgccttcctcaagctgca ggtgccctgt 660 gacagcaccg agtcggagat gaaccacctg tactctcggcgttcgcccag cctggccagg 720 agtgctacct gcagaaggag cctctgctct tcagtgcgccggctccaaca ccaagtaccg 780 ccggctctgc ccctgccgcg acttccgcaa gcggaattccggccggaatt ccggaattct 840 tttgcttttt acgagtcgag ttttttttct tttttttttcaagtcttgat ttgtggctta 900 cctcaagtta ccatttttca gtcaagtctg tttgtttgcttcttcaga 948 8 1295 DNA Homo sapiens 8 taaatatttt atttggatgt gaggtgcagaagagaaaaaa aaaaaaaaaa aaaaaagcgc 60 ggccgcaagc ttattccctt tagtgagggttaatttaaaa agcaaaagaa ttccggcctg 120 agctcagcta ggacagtgac tatttaatatagttaatgcc aggaactttc accccacgta 180 tggaagagtt caatcttaga gtagacaccttgtgaataca caaaccaaca ctcccttctg 240 aattctcatt cctagcacat tgtccttacagattcccagg ggacaccaag aggtttttgc 300 ctatataaaa ttaactagca 
acagtaaatggtgaagtcct aattaaataa gcatgggtta 360 aaagccagtc gtctgctaag atggtgaagggtgtccccat ccccatgttt aataaatgat 420 tgctgaatcc acaattcctc taaagttgatgggaaagttt ccatctttca gataagagca 480 tattatcaac ggttaaagga tatcccaggccctccagcaa atgccttctg gaatcatctc 540 cacattcaga cacatcgtaa acaacagaggggcaatactc atgcttcgca aaagccgttc 600 attccccttg gcaaaggcgg gagagagggctcaccaacct tggagaagcc tggtttacat 660 cgtcaaggta gctactgccc tctagtgttgatatgggaat aaagcaaaaa agtatacctg 720 gttgaaacga aaccgaactc cacaaagtttttcaattact gatgtgtctc agcagccttg 780 gtaggagctt ggaaaacatc atcaggtgaggatattgcac tggagctgac ctcttgtggc 840 ttctaaagtt tctttttttt tttttttttttttttgagac agagtctcac tgtgtcaccc 900 aggctggagt gcattttctt gtgtccaaccaagactcaca taccatctca gctcactgca 960 acctccacct cccaggttca agagatgctcctgccctagc ctcccaagta gctgggatca 1020 caggcatgtg ccaccacacc cagctaagttttgtattttt agaagagatg gggtttcacg 1080 atgttggcca gactggtctc gaactcctgacctaaagtga tccacctgcc ttggcttccc 1140 aaaatgctgg attacaggtg tgaaccactgcacctggcct ccaagatttc tatttggcaa 1200 attcacatag ctactttcat acttgttaaaataccgaaat gcttccatac cagttagcaa 1260 aaggccaccc ggaattcagc ttggacttaaccagg 1295 9 2298 DNA Homo sapiens 9 atgtttttta caatctcaag aaaaaatatgtcccagaaat tgagtttact gttgcttgta 60 tttggactca tttggggatt gatgttactgcactatactt ttcaacaacc aagacatcaa 120 agcagtgtca agttacgtga gcaaatactagacttaagca aaagatatgt taaagctcta 180 gcagaggaaa ataagaacac agtggatgtcgagaacggtg cttctatggc aggatatgcg 240 gatctgaaaa gaacaattgc tgtccttctggatgacattt tgcaacgatt ggtgaagctg 300 gagaacaaag ttgactatat tgttgtgaatggctcagcag ccaacaccac caatggtact 360 agtgggaatt tggtgccagt aaccacaaataaaagaacga atgtctcggg cagtatcagg 420 atagcagttg aaaatcacct tgtgctgctccatccactgt ggattatatc ctatggcaga 480 aaagctttat attgctggct taggacagaggcaatacttt acaataaaag cactaacgga 540 ggtcaagata aatgcgtttt tccaccgatcgacggttacc cacactacga gggaaaaatt 600 aagtggataa atgacatgtg ccgttcggatccgtgcaagg ctcattatgg tatagatggg 660 tccagttgca ctttttttat atacctcagtgacgccgaca atcattgtcc ccatgcaccc 720 tggagacata 
aaaatcctta cgacgacgctgagcataatt catgcgctga aattcgtagt 780 gattttgaac ttctgtacag tgtgattcatcataaggacg agttccattt tatgagacta 840 cggagacggc gaatggttga gggatgggcccaaatcgcaa agtccctagc agataagcag 900 aacgcagaga agaaaaaacg gaaaaaggccctagttcacc tgggaatcat taccaaggac 960 actgtatcta agattgctga aacaggtttcagtgccgcac ctcttggtga cttagttcat 1020 tggagtgatg taattacatc tgcgtacgcagcggggcatg acgttaggat cactgcatca 1080 ctggctgagc tcaaggatgt cgtgaagaagattataggta accgatctgg ttgcccatct 1140 gtaggagaca gaattgttga gctactttacgctgatgtaa ttggactcgg tcaattcaag 1200 aaaactctag gtccaacctg ggctcaacatcggtggatgg ttcgagtcct tgaaactttt 1260 ggatcagatc ccgattttga acatgccaattatgcgcaaa caaagggtca caagagccct 1320 tggggatggt ggaatctgaa ccctaataacttttatacaa tgttccccca tactccagaa 1380 aacacttttc ttgggtttgc gatcgagcagcacctaaact ccagtgatat gcaccacctt 1440 aatgagatga agaggcagaa tcagacgcttgtgtatggca aagtggatag cttctggaag 1500 aataagcata tttacttcga aatcattcacaattacatcg aagtgcaagc aactgtgtat 1560 gactcctcta cacccaatat tccctcttactctcgaaacc acggtattct ttctggtcgg 1620 gaccatcgat tcctcctccg agagaccttcttgttactag gactagggac tccttacgaa 1680 cgttgcgctc cgctggaagc catggcaaatcgatgcgtct ttctcaaacc gaagttcccc 1740 ccacccaatt caaggaagaa tacagagtttttacgaggca agcccacctc cagagaggtg 1800 ttctcccagc atccctacgc ggagaacttcatcggcaagc cccacgtgtg gacagtcgac 1860 tacaacaact cagaggagtt tgaagcagccatcaaggcca ttatgagaac tcaggtagac 1920 ccctacctac cctacgagta cacctgcgaggggatgctgg agcggatcac cgcctacatc 1980 cagcaccagg acttctgcag agcttcagaacactgccacc cacccagttt tataatccgc 2040 tccctctcca gggcaacccc acctttcccattccagggta acccgactac acggctaaga 2100 cttgttctac cgccgtttcc agaactagccgggccttgta gtcaccggaa ccaccccggg 2160 ggtaaaaaat tatattggtt ttctcgtactaatttatggg gtgaatctaa tcgtgatact 2220 ttatttttat ctttttttaa agatttatttttagaaatta ttaaatattt ttattgggat 2280 gttcgttgtc gtcgttaa 2298 10 765PRT Homo sapiens 10 Met Phe Phe Thr Ile Ser Arg Lys Asn Met Ser Gln LysLeu Ser Leu 1 5 10 15 Leu Leu Leu Val Phe Gly Leu Ile Trp Gly Leu MetLeu Leu His Tyr 20 25 
30 Thr Phe Gln Gln Pro Arg His Gln Ser Ser Val LysLeu Arg Glu Gln 35 40 45 Ile Leu Asp Leu Ser Lys Arg Tyr Val Lys Ala LeuAla Glu Glu Asn 50 55 60 Lys Asn Thr Val Asp Val Glu Asn Gly Ala Ser MetAla Gly Tyr Ala 65 70 75 80 Asp Leu Lys Arg Thr Ile Ala Val Leu Leu AspAsp Ile Leu Gln Arg 85 90 95 Leu Val Lys Leu Glu Asn Lys Val Asp Tyr IleVal Val Asn Gly Ser 100 105 110 Ala Ala Asn Thr Thr Asn Gly Thr Ser GlyAsn Leu Val Pro Val Thr 115 120 125 Thr Asn Lys Arg Thr Asn Val Ser GlySer Ile Arg Ile Ala Val Glu 130 135 140 Asn His Leu Val Leu Leu His ProLeu Trp Ile Ile Ser Tyr Gly Arg 145 150 155 160 Lys Ala Leu Tyr Cys TrpLeu Arg Thr Glu Ala Ile Leu Tyr Asn Lys 165 170 175 Ser Thr Asn Gly GlyGln Asp Lys Cys Val Phe Pro Pro Ile Asp Gly 180 185 190 Tyr Pro His TyrGlu Gly Lys Ile Lys Trp Ile Asn Asp Met Cys Arg 195 200 205 Ser Asp ProCys Lys Ala His Tyr Gly Ile Asp Gly Ser Ser Cys Thr 210 215 220 Phe PheIle Tyr Leu Ser Asp Ala Asp Asn His Cys Pro His Ala Pro 225 230 235 240Trp Arg His Lys Asn Pro Tyr Asp Asp Ala Glu His Asn Ser Cys Ala 245 250255 Glu Ile Arg Ser Asp Phe Glu Leu Leu Tyr Ser Val Ile His His Lys 260265 270 Asp Glu Phe His Phe Met Arg Leu Arg Arg Arg Arg Met Val Glu Gly275 280 285 Trp Ala Gln Ile Ala Lys Ser Leu Ala Asp Lys Gln Asn Ala GluLys 290 295 300 Lys Lys Arg Lys Lys Ala Leu Val His Leu Gly Ile Ile ThrLys Asp 305 310 315 320 Thr Val Ser Lys Ile Ala Glu Thr Gly Phe Ser AlaAla Pro Leu Gly 325 330 335 Asp Leu Val His Trp Ser Asp Val Ile Thr SerAla Tyr Ala Ala Gly 340 345 350 His Asp Val Arg Ile Thr Ala Ser Leu AlaGlu Leu Lys Asp Val Val 355 360 365 Lys Lys Ile Ile Gly Asn Arg Ser GlyCys Pro Ser Val Gly Asp Arg 370 375 380 Ile Val Glu Leu Leu Tyr Ala AspVal Ile Gly Leu Gly Gln Phe Lys 385 390 395 400 Lys Thr Leu Gly Pro ThrTrp Ala Gln His Arg Trp Met Val Arg Val 405 410 415 Leu Glu Thr Phe GlySer Asp Pro Asp Phe Glu His Ala Asn Tyr Ala 420 425 430 Gln Thr Lys GlyHis Lys Ser Pro Trp Gly Trp Trp Asn Leu Asn Pro 435 440 445 Asn Asn PheTyr Thr Met Phe Pro His Thr 
Pro Glu Asn Thr Phe Leu 450 455 460 Gly PheAla Ile Glu Gln His Leu Asn Ser Ser Asp Met His His Leu 465 470 475 480Asn Glu Met Lys Arg Gln Asn Gln Thr Leu Val Tyr Gly Lys Val Asp 485 490495 Ser Phe Trp Lys Asn Lys His Ile Tyr Phe Glu Ile Ile His Asn Tyr 500505 510 Ile Glu Val Gln Ala Thr Val Tyr Asp Ser Ser Thr Pro Asn Ile Pro515 520 525 Ser Tyr Ser Arg Asn His Gly Ile Leu Ser Gly Arg Asp His ArgPhe 530 535 540 Leu Leu Arg Glu Thr Phe Leu Leu Leu Gly Leu Gly Thr ProTyr Glu 545 550 555 560 Arg Cys Ala Pro Leu Glu Ala Met Ala Asn Arg CysVal Phe Leu Lys 565 570 575 Pro Lys Phe Pro Pro Pro Asn Ser Arg Lys AsnThr Glu Phe Leu Arg 580 585 590 Gly Lys Pro Thr Ser Arg Glu Val Phe SerGln His Pro Tyr Ala Glu 595 600 605 Asn Phe Ile Gly Lys Pro His Val TrpThr Val Asp Tyr Asn Asn Ser 610 615 620 Glu Glu Phe Glu Ala Ala Ile LysAla Ile Met Arg Thr Gln Val Asp 625 630 635 640 Pro Tyr Leu Pro Tyr GluTyr Thr Cys Glu Gly Met Leu Glu Arg Ile 645 650 655 Thr Ala Tyr Ile GlnHis Gln Asp Phe Cys Arg Ala Ser Glu His Cys 660 665 670 His Pro Pro SerPhe Ile Ile Arg Ser Leu Ser Arg Ala Thr Pro Pro 675 680 685 Phe Pro PheGln Gly Asn Pro Thr Thr Arg Leu Arg Leu Val Leu Pro 690 695 700 Pro PhePro Glu Leu Ala Gly Pro Cys Ser His Arg Asn His Pro Gly 705 710 715 720Gly Lys Lys Leu Tyr Trp Phe Ser Arg Thr Asn Leu Trp Gly Glu Ser 725 730735 Asn Arg Asp Thr Leu Phe Leu Ser Phe Phe Lys Asp Leu Phe Leu Glu 740745 750 Ile Ile Lys Tyr Phe Tyr Trp Asp Val Arg Cys Arg Arg 755 760 76511 237 DNA Homo sapiens 11 cctttcccat tccagggtaa cccgactaca cggctaagacttgttctacc gccgtttcca 60 gaactagccg ggccttgtag tcaccggaac caccccgggggtaaaaaatt atattggttt 120 tctcgtacta atttatgggg tgaatctaat cgtgatactttatttttatc tttttttaaa 180 gatttatttt tagaaattat taaatatttt tattgggatgttcgttgtcg tcgttaa 237 12 78 PRT Homo sapiens 12 Pro Phe Pro Phe Gln GlyAsn Pro Thr Thr Arg Leu Arg Leu Val Leu 1 5 10 15 Pro Pro Phe Pro GluLeu Ala Gly Pro Cys Ser His Arg Asn His Pro 20 25 30 Gly Gly Lys Lys LeuTyr Trp Phe Ser Arg Thr Asn Leu Trp Gly Glu 35 
40 45 Ser Asn Arg Asp Thr Leu Phe Leu Ser Phe Phe Lys Asp Leu Phe Leu 50 55 60 Glu Ile Ile Lys Tyr Phe Tyr Trp Asp Val Arg Cys Arg Arg 65 70 75
13 28 DNA Artificial Sequence Description of Artificial Sequence primer 13 cagacctggt cggcccctgc agccacag 28
14 24 DNA Artificial Sequence Description of Artificial Sequence primer 14 ggaggcagcc ccgggagctg ggag 24
15 30 DNA Artificial Sequence Description of Artificial Sequence primer 15 ggtcaagata aatgcgtttt tccaccgatc 30
16 34 DNA Artificial Sequence Description of Artificial Sequence primer 16 gtggattata tcctatggca gaaaagcttt atat 34
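
As an illustrative consistency check on the listing above, the following Python sketch translates the first 60 nucleotides of SEQ ID NO. 11 with the standard genetic code and yields the opening residues of SEQ ID NO. 12 (Pro Phe Pro Phe Gln Gly Asn ...). The function and variable names are assumptions introduced here for illustration and are not part of the specification; the nucleotide string is copied from the listing with its 10-base spacing restored.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored and
    any trailing partial codon is dropped. '*' marks a stop codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 60 nucleotides of SEQ ID NO. 11 as given in the listing.
SEQ11_FIRST_60 = (
    "cctttcccat tccagggtaa cccgactaca cggctaagac ttgttctacc gccgtttcca"
)

peptide = translate(SEQ11_FIRST_60)
print(peptide)  # one-letter codes for the first 20 residues
```

The one-letter output can be compared against the three-letter residues of SEQ ID NO. 12 by eye, or with any standard amino acid code table.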

We claim:
 1. An isolated GlcNAc-TV-b or GlcNAc-TV-c nucleic acid molecule of at least 30 nucleotides which hybridizes to SEQ ID NO. 1 or the complement of SEQ ID NO. 1, under stringent hybridization conditions.
 2. An isolated GlcNAc-TV-b or GlcNAc-TV-c nucleic acid molecule which comprises: (i) a nucleic acid sequence encoding a protein having substantial sequence identity, preferably at least 70%, more preferably at least 75% sequence identity, with an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12; (ii) nucleic acid sequences complementary to (i); (iii) nucleic acid sequences differing from any of the nucleic acids of (i) or (ii) in codon sequences due to the degeneracy of the genetic code; (iv) a nucleic acid sequence comprising at least 18 nucleotides and capable of hybridizing under stringent conditions to a nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11 or to a degenerate form thereof; (v) a nucleic acid sequence encoding a truncation, an analog, an allelic or species variation of a protein comprising an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12; or (vi) a fragment, or allelic or species variation of (i), (ii) or (iii).
 3. An isolated GlcNAc-TV-b or GlcNAc-TV-c nucleic acid molecule which comprises: (i) a nucleic acid sequence having substantial sequence identity, preferably at least 70%, more preferably at least 75% sequence identity, with a nucleotide sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11; (ii) nucleic acid sequences complementary to (i), preferably complementary to a full nucleic acid sequence of SEQ. ID. NO. 1, 3, 5, 9, or 11; (iii) nucleic acid sequences differing from any of the nucleic acids of (i) to (ii) in codon sequences due to the degeneracy of the genetic code; or (iv) a fragment, or allelic or species variation of (i), (ii) or (iii).
 4. An isolated nucleic acid molecule which encodes a protein which binds an antibody of a GlcNAc-TV-b or GlcNAc-TV-c protein.
 5. An isolated nucleic acid molecule as claimed in any of the preceding claims fused to a nucleic acid which encodes a heterologous protein.
 6. A vector comprising a nucleic acid molecule of any of the preceding claims.
 7. A host cell comprising a nucleic acid molecule of any of the preceding claims.
 8. An isolated GlcNAc-TV-b or GlcNAc-TV-c protein comprising an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12.
 9. An isolated protein having at least 70% amino acid sequence identity to an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12.
 10. A method for preparing a protein as claimed in claim 8 comprising: (a) transferring a vector as claimed in claim 6 into a host cell; (b) selecting transformed host cells from untransformed host cells; (c) culturing a selected transformed host cell under conditions which allow expression of the protein; and (d) isolating the protein.
 11. A protein prepared in accordance with the method of claim 10.
 12. An antibody having specificity against an epitope of a protein as claimed in claim 8.
 13. An antibody as claimed in claim 12 labeled with a detectable substance and used to detect the protein in biological samples, tissues, and cells.
 14. A probe comprising a sequence encoding a protein as claimed in claim 8, or a part thereof.
 15. A method of diagnosing and monitoring conditions mediated by a protein as claimed in claim 8 by determining the presence of a nucleic acid molecule as claimed in any of the preceding claims or a protein as claimed in any of the preceding claims.
 16. A method for identifying a substance which associates with a protein as claimed in claim 8 comprising (a) reacting the protein with at least one substance which potentially can associate with the protein, under conditions which permit the association between the substance and protein, and (b) removing or detecting protein associated with the substance, wherein detection of associated protein and substance indicates the substance associates with the protein.
 17. A method as claimed in claim 16 wherein association of the protein with the substance is detected by assaying for substance-protein complexes, for free substance, for non-complexed protein, or for activation of the protein.
 18. A method for evaluating a compound for its ability to modulate the biological activity of a protein as claimed in claim 8 comprising providing a known concentration of the protein with a substance which associates with the protein and a test compound under conditions which permit the formation of complexes between the substance and protein, and removing and/or detecting complexes.
 19. A method for detecting a nucleic acid molecule encoding a protein comprising an amino acid sequence of SEQ. ID. NO. 2, 4, 6, 10, or 12 in a biological sample comprising the steps of: (a) hybridizing the nucleic acid molecule of claim 1 to nucleic acids of the biological sample, thereby forming a hybridization complex; and (b) detecting the hybridization complex wherein the presence of the hybridization complex correlates with the presence of a nucleic acid molecule encoding the protein in the biological sample.
 20. A method as claimed in claim 19 wherein nucleic acids of the biological sample are amplified by the polymerase chain reaction prior to the hybridizing step.
 21. A method for treating a condition mediated by a protein as claimed in claim 8 comprising administering an effective amount of an antibody as claimed in claim 12 or a substance or compound identified in accordance with a method claimed in claim 16 or claim 18.
 22. A composition comprising one or more of a nucleic acid molecule or protein claimed in any of the preceding claims, or a substance or compound identified using a method as claimed in any of the preceding claims, and a pharmaceutically acceptable carrier, excipient or diluent.
 23. Use of one or more of a nucleic acid molecule or protein claimed in any of the preceding claims, or a substance or compound identified using a method as claimed in any of the preceding claims in the preparation of a pharmaceutical composition for treating a condition mediated by a protein as claimed in claim 8.
 24. A gene-based therapy directed at the brain comprising a polynucleotide comprising all or a portion of a regulatory sequence of SEQ. ID. NO. 7 or 8.
 25. A method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc, and an acceptor in the presence of a protein as claimed in claim 8 or 9.
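
The claims above refer to percent sequence identity and to the complement of a nucleic acid sequence. The following Python sketch illustrates both notions under stated assumptions: identity is computed position by position over two equal-length, gap-free stretches, which is a simplification chosen here and not a method defined by the claims (practical comparisons normally align the sequences first). The function names are hypothetical. The example residues are positions 673 to 704 of SEQ ID NO. 6 and SEQ ID NO. 10 as numbered in the sequence listing, written in one-letter code, and the primer is SEQ ID NO. 14.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions carrying the same residue in two
    equal-length, already aligned sequences (no gap handling)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    return "".join(dna.split()).translate(_COMPLEMENT)[::-1]


# Residues 673-704 of SEQ ID NO. 6 and SEQ ID NO. 10 (one-letter code).
# The two sequences share residues 673-688 and diverge from residue 689.
SEQ6_673_704 = "HPPSFIIRSLSRATPP" + "TSLGLLLHLPGGSPGS"
SEQ10_673_704 = "HPPSFIIRSLSRATPP" + "FPFQGNPTTRLRLVLP"

# Primer of SEQ ID NO. 14 as given in the listing.
PRIMER_14 = "ggaggcagcc ccgggagctg ggag"

identity = percent_identity(SEQ6_673_704, SEQ10_673_704)
primer_rc = reverse_complement(PRIMER_14)
print(f"{identity:.1f}% identity over {len(SEQ6_673_704)} residues")
print(primer_rc)
```

Over this 32-residue window the figure reflects only the chosen stretch; identity over the full 765-residue sequences would be considerably higher because the two proteins are identical through residue 688.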